ERRATA - 2:55 This title is so brief as to be confusing, it should say: "More specific requests, less useful responses" - 3:53 Wrong name: it’s more auto-complete than auto-correct. Think predictive text, not your phone deciding “ducking” is your favourite word. - 4:32 While ChatGPT does indeed solve this equation for j, it's not the language model solving it, these are the 'guardrails' I mentioned: The frontend detects a formula that needs solving and switches into a theorem solving mode. It's a perfect example use of GPT here: use a hardcoded mathematics system on the backend but feed it into the best natural language processing system we have to interact with the human.
A few days ago a commenter asked it about some pretty basic web stuff and o1-preview hallucinated a CDN URL to a framework extension that doesn't exist. GPT is GPT, even if it is very clever, it still is subject to the cautious I outlined in this video.
A few days ago a commenter asked it about some pretty basic web stuff and o1-preview hallucinated a CDN URL to a framework extension that doesn't exist. GPT is GPT, even if it is very clever, it still is subject to the cautious I outlined in this video.
@@NoBoilerplate It's not the model solving it, no, but the model is comprehending what the user is asking, then creating a plan of action behind the scenes, then using appropriate tools. This is fine, especially considering where the technology is currently. Newer models are starting to catch their mistakes before giving the user output. But no, LLMs themselves aren't enough to answer complex (and sometimes simple) problems. And as it stands, they shouldn't be.
@@monad_tcp so its kinda stupid to say "The reason AGI is ALWAYS just "three to five years away" is because all these Startups have three to five years financing" when theyve only been saying that for a few years. versus something like "omg global warming is going to flood the entire world by 1990. i mean 1995. i mean 2000. oops i meant 2010. ok ok i actually meant 2020."
As someone who works in and for massive companies drowning in daily GenAI promises I have found it hard to succinctly articulate my apprehension for most of the presented and sold use cases leveraging Large Language Models. The idea, paraphrased from this video, that "Large Language Models deal with Language, not Knowledge" really distills it down to a short and clear truth. This perspective should make it easier to argue about when it is a bad idea to rely on these systems. Thank you!
This is misleading. Technically they deal with tokens which can literally be anything, any value, any language. The only reason they work is their ability to compress language into abstractions ie knowledge. The knowledge the reasoning is what remains. Look inside an LLM… do you even see any words? These same systems work across many domains and are incredibly good at maths now. This video feels like it’s from 2020
@@plaiday"incredibly good at math" is a little misleading... apple's paper on gsm8k symbolic shows how llms can still be strongly influenced by language and are no proper reasoners, even with todays strongest models from anthropic, openai, meta etc. and even if the models are strong enough to produce strong math results with some accuracy, the issue with hallucination and overconfidence remain a strong point of apprehension against these systems. this issue is only worse in the stronger reasoning models (o1, o1 pro).
As someone with an extremely messy mind, I find LLMs great for laundering my thoughts, picking out bullet points to focus on, but after that I disengage, actual work and study, wholly up to me.
Here's an autism superpower of LLMs: "Hey chatgpt, what does this cryptic email/message actually mean, I feel like I'm not getting what they are trying to say"
2:15 As someone who works in this field, I have preached to anyone who would listen that I think the one truly revolutionary use for these LLMs is as a pre-Googler or a Google-prompt-engineer. The AI generated responses that Google searches give are either exact plagarisms of the top results or utterly useless. If instead a user wasn't exactly sure what they were looking for they instead could ask an LLM to search for it for them (as I very often use Claude for myself) such as "I'm looking for a kind of flower that is usually red with thorns, but is not a rose" or "Is there a name for ". In these situations I've found many of the top LLMs to be unbelievably invaluable, and there's certainly a market for a better way to search the internet at the moment.
Absolutely. My usual intensive Google search involved trying multiple plausibly related terms (more like tags than ideas), opening 10-20 results for each and scanning them for useful information that might be more closely related and then searching again with more terms. Now I can just ask an LLM for an overview, have it expand on what I'm really looking for, and if I need critical information I can search for and verify it directly much faster.
It's useful for rewriting things too. Personally I've used it to help develop a meta-ethical framework. It doesn't get everything right, but neither do humans. Graduate level people don't necessarily give better responses, and when you need "someone" to bounce ideas off, or suggest issues, it can be useful. At the moment the human needs to be the one taking charge of the thinking though and correcting the LLM. Haven't LLM's done things like pass various graduate level exams? And it seems like GPT o1 can do things like maths and PhD level questions pretty well, no?
It's great for new topics where I don't yet know the terms / keywords to know exactly how to phrase what I want to know. LLMs can figure out what I'm trying to say and give me the terminology for me to then go and use in a search engine.
For a while I had hard pushes at my company to incorporate AI into our tech stack somewhere. I always pushed back as "You don't understand what these things really are and therefore why they are incompatible with our business and services. Clients expect us to be CORRECT ~100% of the time, and we get grief whenever we miss something. LLMs are not useful to us". I got a lot less grief from others once the first examples of lawyers being sanctioned and companies being legally obligated to provide services their LLM support bot assured customers were available surfaced. It seems like the hype cycles on these technological fads gets shorter and shorter over time. Does anyone else experience this?
Yeah they had to whip out AI in a hurry after NFTs flopped. There's a certain sector of the tech industry that lives or dies by scamming investors, so if AI ever actually goes bust, they'll be back at it with something else in a month or two
At least, ChatGPT has been incredibly useful to my journey to learning the basics of linux in the past year. As you said, AIs gives fairly decent responses when it comes to simple stuff, which is what I need it for: How to make a script? How to upscale my screen resolution? How to run this program from source?...
@@plaidayStuff that it doesn’t have a large amount of training data on. Common Linux commands it will do great on, for example, but as you get more and more specific/use more obscure libraries it will break down more and hallucinate since it has away less training data.
I would argue that it could be potentially harmful as you’re learning Linux, especially as you move beyond being a beginner. A lot of the important parts of Linux come with learning how to read documentation and understand how specific packages/tools work. Take the Arch Linux documentation for example, it is intended for Arch but a lot of knowledge can be applied in broad strokes. I imagine that ChatGPT pulls a lot from the arch docs but what gets missed there is a centralized source of information that is explore-able. Sure, you can ask ChatGPT what command to run to restart the network manager but eventually you’ll be looking for more detail than that and imo the better learning experience is knowing which docs to check, and looking through the examples. In that case you’re getting information from the developers of the tools themselves which is often much faster and doesn’t require fact-checking because you’re getting it from the source. You become your own GPT and can start to infer what flags you’ll need for a command and a simple check in the docs that takes 10 seconds vs the thirty seconds to form a prompt, try the output, then pasting in the first error code you get
I've found another weird thing is that when I say "As a computer scientist with a reasonable understanding of what these do and how they work, what these companies are promising is impossible." people are very un-receptive to it. They tend to count such arguments as equally valid and well reasoned as those coming from business people making the false promises.
@@jeremydiamond8865 That's because people are looking into the future not the current time. They see the current state of AI and just extrapolate into the future of how great it's going to be when they don't have to think or do anything as the computer will handle all that. This creates a condition where people idealizes AI and need it to work (basically false Hope). This is also the same effect you get when discussing politics. People will use idealistic scenarios of why the political system they wish to be in place will bring about a Utopia even if the evidence and past trials say otherwise.
@@asandax6 Sure, ideals can seem _pie in the sky,_ but checking oneself against some big, fixed object up there can be used to do some real things. Like, for example, navigating an ocean crossing. Without ideals, we're wandering around in (imperfect) circles without a compass. So go easy on our theoretical models, mkay? From the governmental and economic systems we adopt, to our concepts of 'happy' and 'healthy' and 'good,' to consciousness itself, ideals are pretty much all we've got to orient ourselves here. Evidence acquired from "past trials" is only half of the cleverest way forward. All ducks are brown only until you see a white one. Giving up the _a priori_ also means giving up mathematics, and for that matter, pure logic, and reason. Let's shoot for the Moon, but expect a bit less. Sometimes it is fair to question whether a specialist is too close to the subject to see a bigger picture.
@@asandax6 you’re correct. A lot of people, even those who work in computer science see AI as something that WILL happen. Disregard what it can or can’t currently do, it will eventually “learn” how to do everything. It makes having a conversation about it difficult
@jeremydiamond8865 "I think there is a world market for maybe five computers." Thomas Watson, president of IBM, 1943. As a cybersecurity architect writing secure bootloader and trustzone code for a Tier 1 ISP, although I agree almost entirely with the video, I believe AGI can be delivered but that it will take a few decades. Why? The LLM hype train has to die. Normal people are incapable of understanding that LLMs are statistical language, but the science continues to improve in every perceptual field such as logic, reasoning, pattern recognition, categorization, and other perceptual or cognitive concepts which, once unified, will form the basis for a computational intelligence. The science has not abandoned any of these things, but the money and the marketing just isn't there to get everything together and keep the brilliant ones on task. Still, it's not "impossible," it's "impossible for a language model."
I use little AI tools I've made myself on the regular using my local llamafile. The key to using an LLM is exactly what you said in the video: Acknowledging that it's a language processor and nothing more. I have autism, so I have tools that use LLMs to convert my thoughts into more NT friendly words, and vice versa. My thoughts are often quite scattered, so I use an LLM to compile those thoughts into more sensible lists. I'm working on a Computercraft turtle in Minecraft that I can write in the chat to and make it do things like repairing a base. I use the LLM to process my commands more accurately than any keyword search I'd write could, then that calls back to real code with real parameters to do actual tasks after confirming that's what I actually wanted. AIs can be amazing, as long as they're not misused
A few days ago I asked it about some pretty basic web stuff and o1-preview hallucinated a CDN URL to a framework extension that doesn't exist, however the code the hallucinated extension very much works because it is part of the basic functionality of the framework. I hope people see how dangerous this is, because now I can just make a CDN serve /framework/extensions/[common framework topic].min.js which just contains a bunch of malware and devs won't even know they owned themselves. This is their best offering.
Your point on anything with low amounts of training data is spot on I was asking Claude AI and ChatGPT about the logging tool Fluentbit But they're only trained on Fluentbit v1 and v2 - not the current v3 which has a different syntax Extremely frustrating to work with
Oh, thank you - I've been experiencing this same issue with some Rust crates that are frequently updated - and the majority of the training data is obviously for old versions. So during a single session, it will go from giving a specific answer to reverting back to older API's as the questions get more specific. It is infuriating. The obvious reason is that the statistical model is going to see the syntax of the old API as being statistically more likely next keyword - and goes with that. Which is also problematic because there is not necessarily a way for the AI to know that the training data is from a specific version of the API.
Why don't you paste the docs in and then ask it questions about it? That's a much better way to use it than hoping it has access to the latest version of anything.
@@DodaGarcia I believe the problem with the current models is that they have too much data. They are very good with language, but do not really have an understanding where they lose predictive power. Maybe better have a smaller model only trained on language and some basic logic/data processing and put additional information like an bigger version of wikipedia. And than sample the information out of that and create the response of that data. Would give much better control and avoids hallucination.
1+1=2 is an equation. 2e^2+5j=0 is a curve. Math programs should be able to identify functions pretty easy. Just use the right program for the right task. But when we talk about AI we expect real understanding - not doing predefined tasks.
5:03 disappointingly this applies to a lot of peoples behaivors too: " seem good at first when you ask it simple questions but as you dig deeper they fall apart and get increasingly inaccurate or hit artificial guardrails and only provide surface level responses"
I mean that's valid. I know I have this problem just the same. There is an important difference though which is why the "race to AGI" is so heated. The junior fresh of an internship can actually finish whatever task they are assigned even if it's complicated (to a reasonable extent). At least the ones I've been around certainly can. The AI, no matter how much compute you throw at it, cannot. It doesn't matter how magnificently it can write that singular function with matching tests and ci/cd. These AI systems cannot plan. Period. They are incapable of taking an unknown task and breaking it down into more manageable chunks. Splitting them and creating more as needed until the task is complete. OpenAI's apparently claims that o1 can? Not that I've seen. You still need to do everything yourself. The AI just makes things go faster. But maybe that's a skill issue on my end. That's not lost on me. The key step is apparently AGI will be able to do what the junior can do. I don't see it happening without some major architectural overhauls. We have scaled out everything he currently can. Now it's a waiting game to scale compute.
People are capable of communicating their specifically-relevant limitations and showing humility, and I doubt many people would presume a random person to be authoritative as too many people seem to presume LLMs to be.
@@LiveType, AGI is just a name for a thing that nobody knows if it will ever exist. And there does not exist any solid reason for why it ever would. There are known problems to be solved until an AGI is possible, and those problems concern the very nature of knowledge and its representation. How do you make an electric signal or a number on a spreadsheet aware of itself or other signals or numbers? You need to answer this to realistically believe in the possibility of an AGI.
2:41 "The more specific the answers you want, the less reliable large language models are". Very well put! I would also add to that - the greater the delta between your expected precise answer and the vague prompt you give, the worse the LLMs are.
This phenomenon of selling promises to investors needs a boilerplate name that captures the imagination of the relevant audience to propel it to semantic immortality and ubiquity.
While LLMs may not be great at reasoning and suffer from hallucinations, I find it invaluable for summarizing long research papers, creating outlines, brainstorming, and helping me express my thoughts when I’m having trouble finding the words. I wish companies would advertise these advantages instead of promising stuff that isn’t true.
@@maybethisismarq The companies actually selling LLMs do advertise those features to make sales, e.g.: Microsoft has auto summarization as a product of their Teams platform, if a meeting is recorded and you pay for the Teams "Premium" you will get a summary of everything discussed in the meeting afterwards, it's a pretty good feature, and it's multilingual. But yeah, you won't hear the "AI" companies seeling this feature because when you think about it it's a feature that already needs a platform to be useful, or do you think if OpenAI started a Teams competitor now every company that already have contracts with Microsoft would migrate it to OpenAI? Most of the AI companies are advertising features that don't exist because they are selling the idea of those features to their investors as a miracle that will bring 100x more profit, if they were to present plans for actual things LLMs excel at investors would realize most of the AI companies don't have the platforms to apply the AI to, the market is already capped for those and the companies would actually have to have a business plan for this, and when investors see a 1.5x return on profits over 2 to 5 years they would not be interested. On the other hand, selling a promise for something magical in just some years that will get them 1000x ROI, on boy, they really want that... All in all I can see that Microsoft is one company that's actually integrating LLMs into useful things and getting real money from it. Facebook has been great due to their Open approach to releasing the Llama models that's sparked more open models approaches from other companies (Like Qwen from Alibaba and Exaone from LG).
I agree, and not just that, but their ability to code is extremely helpful. It can turn someone who doesn't have a previous background in the syntax of a new language an edge like never before.
In the anime Frieren: Beyond Journey's End, the way the show portrays demons is simply as monsters that can use human language, however they have no understanding of the meaning of whatever they're saying. They understand how humans react to certain uses of language, and they will simply say whatever would get the desired reaction. One demon might start talking about their father so that a human would become less aggressive toward them while also having no idea what a "father" even is. This strongly reminded me of modern language models. They don't ever say what's accurate or true, only ever what they think should come next (and considering how training can work, it's largely to get a desired reaction in the form of approval from human trainers). They're not artificial intelligence. They're language models and they do nothing but model language. The problem largely lies in many people mistaking language for intelligence. Just because something can use language, like language models, that doesn't mean that thing is intelligent. The reverse is also true, where some people can be dehumanised because of an inability to use language due to various disabilities.
I feel like I’ve been misled into thinking that LLMs are genuinely smart. They certainly do a great job of appearing intelligent, but there’s a big difference between truly understanding something and just predicting the most likely next word.
Is there? I think that statement would require us to have a concrete, objective description of what "truly understanding something" actually means that can be tested.
@@somdudewillson well, not really - if you try hard enough - you can trick an LLM into giving away the game. You can do things like ask it a maths question, then apply the same logic to another question but using a real world scenario to frame it - you suddenly realise it cannot apply reasoning cross domain. That is a simple to understand, and widely accepted principle of understanding. I you truly understand the concept, then presenting the same problem in a different context should be easy to solve. LLM's can fail at this.
“LLMs” are just predicting the next word” is a talking point that can be used by anybody wishing to discredit them. The decoding step, in which the hyper-dimensional array of information output by the neural network is transformed into readable text, uses probability to pick a series of words that best express that information. To say that the decoding process is the entire process literally ignores the existence of the neural network and what makes big ones smarter than small ones.
My favorite example of ChatGPT breaking down is actually when you ask it about a logic problem that has lots of variations present in the data. If you ask it about the classic puzzle where there's a goat, a wolf, and a cabbage, and you have to take them across the river in a rowboat, ChatGPT will give you a mishmash of answers to similar riddles and it sounds completely mad.
4:56 reminds me of my experience with asking GPT about programming stuff. Common libraries, general questions, usually you’ll get good answers, but as soon as you start to ask for something a little bit unusual it all falls apart. Hallucinations, random invalid logic, the works.
While I agree with many of the conclusions on how to think about current AI tools as a consumer, I think the analysis on the inherent limitations of gpt-style systems ignores a lot of the research going on atm. We know for example that llms do actually develop an internal world model. This has been very explicitly shown with Otello-GPT, a toy llm, that was trained on move sequences of the eponymous board game, where researchers where able to fully extract the state of the board game just by looking at the activation space. Recently further research has found similar results for non-toy models like Llama 2. Further research has to be done of course, but it might turn out that to become really good at predicting the next token, eventually you have to understand what you're writing about. There's a lot more going on of course and I'm definitely not arguing that there aren't still significant hurdles to overcome, but simply arguing that "this thing learns to predict language, therefore it can only ever understand language" isn't quite right either.
"understanding language" is an enormously powerful feature for a system to exhibit. Ultimately pure mathematics is just language with the additional constraint that valid grammar (i.e. constructing one's sentences from accepted axioms and inference rules) implies a "correct" (relative to axioms and inference rules) result. I think people need to remember, however, that this power embeds Turing-completeness into the system. And we know there are very rigid constraints on what is computable, and what problems appear to have infeasable trade-offs in their computability.
@@danielkruyt9475 exactly. "Language" is not merely a collection of words. It is embedded with knowledge. Understanding language implies some non-insignificang level of knowledge.
@@franciscos.2301 no. Language is not embedded with knowledge, it is used to communicate knowledge. Big difference. Large language models can create complex maps of word interrelatedness which may allow it to even appear to have the ability of inference, but they ultimately "know" nothing except how we string together words. Since we use words communicate logic, they can appear to be logical because they are able to put words together in ways that we would put words together.
Thanks for bringing this up. That was an interesting white paper. I appreciated this video but it did give the impression of being written from the perspective of an intermediate/advanced user, and not someone with machine-learning experience or background. Even from my cursory understanding about it, when it comes to domain or niche knowledge, for instance, I kind of thought "well what about RAG, chain-of-thought, or alternative and underexplored architectures besides transformers?" I really feel like deployment by commercial firms is overhyped and premature, obviously, but that doesn't mean that there isn't a ton of depth left to this rabbit hole. The idea that even GPT is essentially an outrageously trained autocorrect belies the fact that we actually still barely understand how these models are actually working, especially as they grow exponentially in parameters and scale; hence Otello-GPT.
2:15 Yes! I love this feature, I use it to reverse search a definition (and any other criteria like "word begins with the letter m") to find words that are on the tip of my tongue. That's it. I haven't found a good use of LLM/GPT/whatever anywhere else.
LLMs are good for advanced sentiment analysis if your concern is data science-y. Previous sentiment analysis used to attach positive and negative weights to words and then just count them up (eg, looking at a review of an airline company, "delay" would have a negative weight). But this lacks nuance in terms of both the domain and language quirks like sarcasm. Whereas LLMs are much more proficient at "reading the room". (Technical note: this is almost certainly due to the attention layers, that contextualise each word according to its neighbours, as well as just the whole "trained on the entire internet" thing)
Personally, it's an alight upgrade for duck-debugging/Stack overflow trawling to help tackle some of those basic hurdles you can run into. It's not perfect by any means, but as someone who doesn't directly code often, it's helped me throw some scripts together.
Dear Trist, Your video contribution is excellent. I am absolutely thrilled with the way you provide a technological, sociological and economic analysis of the current state of AI technology in such a short space of time. I have watched more than 8 comprehensive videos from renowned RUclips channels with academic analysis on the limits of AI. Your video absolutely sums it up in a short time without losing any of its importance and explosiveness. Outstanding achievement! Thank you for your contribution!
"The more specific the answers you want the less reliable LLMs are." So, fundamentally no different than asking a human, consulting a book, or searching for answers on message boards. I don't understand why the bar for artificial intelligence is so high while the bar for humans is so low. If you and ChatGPT took a quiz with a variety of questions from different domains of knowledge I am 100% confident that you will do worse, while taking about 100 times longer to do so. It's almost like the pearl clutching while screaming "BUT IT MAKES MISTAKES!!" is motivated by a pervasive deep-seated fear. 🤔
You are right of course, but the problem is that a general LLM isn't being pitted against the "average" person, but specific people. When you need a job done, you expend a great deal of effort to find the right person to fill that job opening. When you have a knowledge question, you can study specific books that contain an expert's knowledge of the subject. LLMs can compete when low level tasks, like doing schoolwork or, say, writing ad copy, but if it is going to be worth the monumental cost of hardware and power and human effort it has to be way, way better than the average person. Average people are cheap. The world is full of them. It has to be better than a trained and experienced person in a field in all fields that they wish to apply LLMs to.
I don't mind that it makes mistakes, the problem is that due to the nature of GPT is that it's hard to spot them. You're smart enough to know how to use it and where it falls short, but the companies name their products very deliberately (such as "Einstein") to give a misleading impression.
Yeah I really don't get that either. People can be people but AI has to be absolutely perfect. Do people not realize that once that happens, we all become expendable? Plus, the tech hasn't even reached its final stages yet. This is like complaining in the 1950's that computers are only good at doing calculations and nothing else. Yep, solid insight you got there buddy.
I'm glad you brought up blockchain as a point of comparison. Through this whole hype cycle, I've often been reminded about that bike share program (I believe it was in the Netherlands?) that was built "on the blockchain". It used the headline hype to gain funding and ran really well. The more credulous press looked into it later down the line, and the devs were upfront and honest that the software only used the blockchain in a purely incidental way. It was actually completely ordinary, boring, functional, valuable software. If we must live with AI hype, I hope it can be in that manner.
the zed code editor uses the ai selling point and I hope they just use it as that only. There’s a lot of promise in how the rest of the product works (performance and live-multi-person editing support) and I hope that in the end we get that with a little optional chatbot tucked away somewhere. Noone needs these things to be more than chatbots and slop generators.
AI sucks because the frameworks they're using are crude, and they're using massive knowledge-as-power models because that's what's easy for them. I think, once they make the leap from abstract neural intuition to formalised framework processing, then we can worry. And marvel again (It's still pretty good but we know the limitations now)
I found your comment interesting, so I asked ChatGPT o1 full model, - what does this mean to you? "the leap from abstract neural intuition to formalised framework processing" Its response, "this phrase often refers to the process of moving from a raw, intuitive understanding of something-like the way a human brain instinctively “gets” a concept-into a more rigorously defined, structured, and often mathematically or logically framed model. It’s the shift from “I know this pattern when I see it” (an abstract neural sense) to “I can describe and manipulate this pattern using a systematic set of rules, equations, or algorithms” (a formalized framework). In other words, it’s the journey from having a vague, gut-level feeling that something is true or meaningful to actually translating that feeling into a precise representation that can be analyzed, tested, and reliably applied." LOL good phrase!
I develop programs using llm's. It is not as bad as you suggest. In order to be useful at higher complexity levels:" design patterns, architecture, interrelation of classes, program flow, ..." you have to use certain technique and relay on .sh scripts. All in all thinking about it, it is a darn outlandish experience communicating with an Alien Intelligence.
Yeah, the things people criticize about LLMs are shocking to me. The technology is fucking extraordinary, and nobody is forced to use it anyway if they don't find it useful.
GODS I freaking love it! I dropped off about half way through the show, because I was watching it too fast, and Simon's backstory made me feel feelings. I should finish it
@@NoBoilerplate please watch it I've been watched a bunch of children's animated stuff I never watched when I was a kid and adventure time is by far the best. The finale made me want to bury myself in a hole and let nature reclaim me (positive)
@@NoBoilerplate Definitely worth it! One of my favorites. I haven't seen the last season because it's too good to die. While ranting to my friends years ago, I brought up that dragon aswell! Surreal to see someone make the same connection.
Try to get AI to draw elementary shapes for art students-spheres, cubes, cylinders, cones, and pyramids-without any additional elements. It cannot, no matter the model! ChatGPT explanation: While AI models like ChatGPT cannot directly "draw" shapes, AI-powered tools (e.g., DALL·E or MidJourney) can generate simple shapes. However, ensuring they are elementary and without additional elements can be challenging due to AI's interpretation of prompts, often adding artistic flair or extra details.
1:45 I've also seen that in a talk from Dylan Beattie. He said something like "Technology is what we call stuff that doesn't work. Once it works, it's no longer technology, it's just stuff." Although I think the quote originally came from Douglas Adams.
Rings very true. I love Douglas Adams, his books are a primary source for my writing for Lost Terminal, I wonder what you think of it? ruclips.net/video/p3bDE9kszMc/видео.html
@@NoBoilerplate I watched season 1 probably like a year ago. Thought it was neat. Kind of reminded me of Wolf 359, even thought it's pretty different. But I never got around to watching the rest of it. I'll see if I can get back to it eventually.
As a software developer working on pretty low level and template heavy code, AI (I've only used GPT-4o) is very good for a few quite specific use cases: 1. Cleaning up and summarising verbose compiler error logs which can be hundreds of lines for a single issue. 2. Generating very specific functions and/or metafunctions with a clearly defined set of inputs and outputs and how it should transform them. Trying to work with multiple successive prompts never really works, it's better to pessimistically assume the AI will not keep any knowledge of past prompts and just be happy when it remembers something useful from the session.
When I hear about a new flashy thing that’s supposed to materialise some day soon I’m thinking to myself: ok, sounds great. I believe it when I see it 😊
Speaking of which, did openAI ever release (or did anybody recreate) that horny chatbot mentioned in that one Rational Animations video? Asking for a friend of course.
I mean, you install a local LLM executor (e.g. ollama), download an "abliterated" model (safety-rails removed), and set the "temperature" of the model to something nice and spicy (like Pi^e, 22.4592) rather than some boring low number between 0.0 and 0.6 Have fun with your raunchy, scatterbrained, unhinged chatbot.
You are a smart fellow. I appreciate this take and very much agree with it. I have been criticizing GPT since it was rolled out, and if you say anything out loud, a bunch of idiots come for your comment. I didn't know how these things worked at first, but then I- oh, I don't know? Actually studied what they are and how they work? 🙄I came to the conclusion that this technology alone is very unlikely to ever become Data from Star Trek, which is what they want you to think it is. It's actually more like a (somewhat altruistic) superficial sociopath with no need to breathe, and thus can deliver more lies-per-minute than a biological system cursed with the need for oxygen. Claude is frequently a better model in this department for its more "responsible" tone, whereas GPT will lie, and lie about its lies, using all kinds of weasel words and passive voice to avoid responsibility and obfuscate the issue. I am a solo games designer and have had to use these tools to help teach myself C# this year, but I find it often much better to simply buy and read books. While I generally agree with the video that these LLMs are great for broad or superficial, widely accessible information, what I actually find more common from GPT is to deliver a mostly accurate essay on some aspects of coding, and yet then stealthily bury in the middle of that essay something that's wholly and fundamentally untrue, even dangerously wrong. It makes you suspicious that perhaps these things aren't even mostly altruistic.. I was only able to start spotting these inadequacies once I reached intermediate proficiency with C#. I can now recognize how they use common code examples and patterns I've seen online (for both C# and Unity) and shoehorn them into all kinds of inappropriate use cases, as my queries get more and more specific. These tools are extremely dangerous to a junior student trying to learn to program. I've said that over and over and the comment vultures always say "Well you can't *only* use GPT," as if it's some kind of catch-all, obvious answer. It is not. 1) Look around and you will see how many people are almost only using GPT. 2) Even if one is not, you *cannot* fact check and correct its lies when you "don't know what you don't know."
I think humanizing it in any way is misleading. It's not benevolent, it's not non-benevolent, it has no motivation. Not neutral motivation - no motivation. It's content without any creator, a regurgitated content. Like an algorithm that takes text amd randomizes words in it - is it good? Is it malevolent? Is it telling the truth? Is it lying? No, and approaching it this way makes no sense. The questions themselves make no sense
@@NJ-wb1cz It does one thing and one thing only, it looks at a text output and guesses at what word comes next in sequence. That's it, it is really good at that one thing but it is not AGI, as AGI requires either A, a fully simulated human brain or B, creating something capable of self awareness which right now is as of yet not possible, the agents we have today are still specialized to the specific tasks they're given while an AGI should be capable, like a human child, to figure out any new task presented to them and be able to perform it relatively well without losing efficacy in other tasks.
@@gavros9636 we don't know what self awareness is, and don't know if a model of a brain would have it. All of this AGI talk is highly made up and speculative When it comes to the current models and the way they work, all they need is to completely change the training and the training algorithms, and essentially raise millions robotic babies how they would raise a human one, for the same amount of years, with the same effort to provide thwm the same social experiences in different conditions etc. Have them process at least the same inputs human brain processes, preferrably better ones. All the smells touches sounds visuals, etc. I don't think that's in any way feasible in the foreseeable future And the theoretical discussion about "self awareness" and "AGI" don't matter. These are abstract fantasies, not actually anything tangible.
The best way to think of it is as an auto-complete, like when you start typing a search into google and it suggests a list of searches you might want. LLM's are just a larger version of that auto-complete feature. When is that google auto-complete feature useful? When you're looking for something popular. When is it not useful? When you're looking for something unpopular.
3:17 isn't the solution to this limitation RAG + agents + tool use? Perplexity search engine seems to prove this. Its why shipping larger context windows was a major focus of llm improvement for the last few years.
GPT is best suited at the human/computer interface, and we've seen really good uses here. BUT the claims suggest that anything is possible, which is simply not true
Language over knowledge is such a great way to describe these models’ capabilities. I’m a mechanical engineer and I use GPTs all the time to point me in a direction when I need to solve a problem. Some of my coworkers on the other hand use it to solve a problem for them and the results of that have not been great.
Well done! The main problem I found with AI is that I'd ask a question, and it would give me a generic answer that was mostly correct, but then I'd ask for specificity and get none. In particular, I was asking nerdy D&D dice statistics questions. The explanations always looked good, but the math was always always always wrong. Even after I told it the right answer, it will still respond with a wrong answer.
Mid tech but it's only been 2 years since chat gpt came out. Compare progress of AI from 1950-2015 and 2015-2024. The rate of progress is insane. It's easy to just look at day to day changes and it feels so slow. The internet took about 20 years to reach mass adoption! I think AI is truly different this time. There's so much research and money going in that I think next year and the year after is going to be pretty intense. This isn't something hard to make use of like crypto. Some things actively in development - Dynamic computation models (think longer and harder on more difficult questions) - Reasoning based model (Don't teach the model the answer to the equation, you teach the model how to solve it) - Reinforcement learning (Instead of teaching the model the correct answer, teach the model only if what they did is wrong or right. This technique was used to create AlphaGo which beat the human world champion) - Test time compute (Ask the model to try answering it multiple times in parallel and choose the best answer) - Incremental learning (Train the model during inference. This is what your brain does. Anything that you memorize is stored within your biological neural nets) There's probably more techniques that I don't even know or none of us even thought of yet. To the claim "it's just a fancy auto-complete". Imagine this: You read a detective novel up to the point where the detective says "The criminal is...". You have to figure out the criminal without seeing the answer. In order to do that you have to understand the story, make couple of theories and finalize on an answer. The name of the criminal is based on all the text before that. Now imagine if some one or something can do this consistency. Wouldn't this be true intelligence? I think just because a model is trained on the next best word, that doesn't mean it's a dump auto correct. It's a asymmetric relationship. A dumb auto correct is algorithm to predict next word. But not all entities that can predict the next word is a dumb auto correct. Take a human for an example. We can in fact be a pretty reliable auto correct "AI". Also the base model is trained on auto-complete but a model goes through different iterations of fine tuning. This is where we get "assistant like behavior". If we didn't fine tune the model it'll literally just be an auto complete. This is why llms asks you to clarify your question if you send "How do you ge" instead of trying to complete the question that you prematurely sent Mark my words, 2025 and 2026 will look insane and it will be one of the fastest growing technology (even when compared with smart phone or the internet)
Consider the example of fully self-driving cars. Remember the promises made circa 2017-18 and the appeals to "exponential growth" and "look how far we've come in the last 4 years alone" type claims? Where are they now? We've got a few slow moving robo-taxis that work in a select few neighborhoods at best. Actual technological progress in these types of "nascent" fields happens in short bursts of growth (which is way faster than exponential), followed by plateaus. It's impossible to predict when/where these plateaus will be hit. All the things you mentioned are promising avenues, but it is not really convincing to say "think of how much more new SotA tech we might get from all these research directions". LLMs and transformers were themselves one among many ideas, most of which went nowhere (at least in comparison to LLMs). And the appeal to more compute has similar problems. Since we don't have a theoretical foundation for what results in intelligence, we have to just hope more compute or more data or some new architecture solves it. In something like theoretical physics, you could reasonably expect to model a new hypothesized particle and predict what range of energies you need in your particle collider and decide if you can build a big enough collider. If it turns out to that you need a 500km collider and the biggest we have now is 27km long, you could just say "we dont have the technology to detect this yet" and look for other avenues. But with AI you just have to hope and pray your new approach ends up being worthwhile. Demis Hassabis has recently said that its an "engineering science" in that you first have to build something worthy of study and only then can you study it. There's inherently more uncertainty in that process compared to "normal science" where hypotheses and predictions can be made with a greater degree of certainty.
@@psd993 Interesting you bring up self driving cars since that's the main AI progress I've been following closely I'll just say FSD v13. You'd be surprised next year when it starts entering main stream market ;) ruclips.net/video/iYlQjINzO_o/видео.html
I rarely write comments but I felt this was worth addressing. The points made in this video are valid to an extent. LLM's on their own are generally useless, hallucinating auto-complete machines. Its true that profit motivated project managers are feining to present anything AI branded to their stakeholders. We are not where we need to be now for AGI. Despite all these facts, the technology is advancing about as quickly as we would expect. By introducing chain-of-thought and multi-modality to GPT we have effectively expanded beyond the limitations and are rapidly approaching the horizon of AGI. If you compare GPT 3.5 against GPT o1 its a night and day comparison in all factors. This video judges AI on the notion that it is fully fleshed out and that's just not true. We are still far from having a Jarvis-type assistant for everyone, but its not as far as people are making it out to be.
Probably the greatest/most overlooked thing LLMs provided me with is making voice input actually useful. I can just turn on 'voice keyboard', for example loudly list items I'm going through and with simple prompt LLM will make a good, markdown list from my word salad
Exactly. It's frustrating to hear the constant criticism about the limitations of LLMs' knowledge while the "language understanding" aspect of them is what really makes them shine. They're fantastic for summarizing/parsing arbitrary information.
It's been fun to use it as a language learning tool. As you say, it understands language really well, with increasing quality the more common the language is. I should really try it on some obscure language sometime, maybe a conlang. Man, imagine talking to an LLM in Sindarin.
right as i'm getting fatigue from prompt engineering and taking on an AI related role at work, this pops up i'm half tempted to pivot int working on VR instead
@@NoBoilerplate my company doesn't want to make it do language tasks tho, and prompt engineering is it's own special hell, but we shall see what the future holds i suppose
VR is still bound by the hardware. I've been there about a decade ago. Hardware has not advanced nearly as fast as I thought it would have. It's progressing at maybe half the speed I had expected back in 2014-2015.
4:14 “…and we mistake language proficiency for intelligence…” _And_ reasoning and understanding and some inner psychological goings-on and who knows what else? It all amounts to a giant attribution error. (It’s easy to see why-in the millions of years of human evolution, the only beings who could respond meaningfully to us via language were _us,_ and, therefore, these large language models “must be” like us in all sorts of other ways, too.) On another channel I watch involving AI there are always these references to “reasoning” and whatever and I say these models are _emulating_ the verbal behavior that we associate with reasoning-it’s _not_ reasoning. That’s it. I _will_ say, as you do 2:11, that these language models are really good, perhaps stunningly good, at language tasks-acting as a thesaurus, translating, cleaning up awkward or ungrammatical text. They are, after all, _language_ models. But they’re _not_ intelligent. They’re like the savants of the computer world-highly proficient, even exceptional, at language, and surprisingly deficient at everything else. I haven’t seen a video online express these ideas so clearly until this one. It’s really excellent.
Thank you so much! I have a close relative who has trouble speaking fluently due to a medical issue, and it's shocking to see how immediately people think he's stupid. We're SO hard-coded to guess deep insights by language proficiency!
You’re totally right, but what concerns me isn’t the capability limits of LLMs, but what they might achieve with clever interfacing. Like the “theorem solver” mode mentioned in the errata. Do you think, if the accuracy is high enough, this paradigm of “autocorrect on steroids” can actually do fancy sci-fi stuff by combining a giant number of different models with clever interfacing?
I like prompts like "What are advantages of the A* algorithm - give me sources". With this i almost always get summaries that are easy to understand of papers or articles with links to papers that describe the problem in more detail.
I keep hearing the same complaints about AI. Personally I use it to generate boilerplate and then go through line by line and analyze what everything's doing. It's infinitely quicker than typing out 300 lines of code by hand. Are you really have to do is go through and make sure all the logic lines up with what you want. Maybe for someone like primeogen who's been coding for 30 years it's quicker just to type it by hand. But for a junior engineer it's much quicker and easier to use AI and then patch up whatever little mistakes it makes. Also I'm wondering are people trying to use AI for a whole project? In my experience in only works well at one objective at a time.
Senior developer here. Thank you for helping provide me with a job for the next several decades. I'm the experienced person they bring in to clean up critical bugs deep in the software, that require deep knowledge of the languages, libraries, and runtimes involved. Every single time I touch a codebase, there's always some critical bug around data concurrency, memory management, type system semantics, security vulnerabilities, or some other problem involving a lot of crunchy algorithm knowledge. There is a zero percent chance that an AI understands the underlying memory model or algorithm design or language specification well enough to write code that's not going to completely fall apart when it's thrown into the real world. So, thank you for helping keep me employed. I'm not a fan of dealing with large legacy codebases, but it pays well enough, and those codebases will be around forever. As long as contractors are prioritizing deadlines, and as long as developers are using AI and "touching it up", I'll have a job rooting out critical bugs that require a deep knowledge of core computer science topics. If you don't want to be in the business of helping create jobs for more senior developers, put in the work. It's what Primeagen preaches. He's a good developer, but you don't need to write code for 30 years to be that fast on the keyboard. If you want to vomit out 300 lines of code, just get good with your tools. All it takes is a little bit of time and a ton of discipline. Take a few months and drill your keyboard shortcuts. Move your mouse to the opposite side of your desk, and only reach for it if you spend at least a minute fumbling through your shortcuts. Even better, learn about alt+tab, and google the keyboard shortcut you need before reaching for your mouse. You'll know you're done practicing the first time you feel like it's too much effort to move your hand to your mouse. Do that practice for even just a few weeks, and you'll be twice as fast on the keyboard. Do some typing drills, and work on your muscle memory. A few hundred lines of code isn't anything difficult. It's surprising and pathetic how often developers can't write code. "Why Can't Programmers.. Program?" is getting close to 20 years old at this point. Don't be like that. Be better. After your muscle memory is good enough that your hands magically start converting your ideas into character in your IDE, work on algorithms and design patterns. And just write a ton of code. Practice toy problems, and build some real world applications. There's no experience more valuable than sitting down and trying to create something yourself, without any notes to copy. If a year of programming practice sounds tough, remember that some fields require 7 years of school. You can either spend 30 years not caring and slowly learning through accidental lessons, or you can dedicate a year or two and really master your craft. Go read the top posts on classic blogs like Joel on Software and Jeff Atwood's Coding Horror. Dig incredibly deeply into the crunchy theory of computer science, and binge through videos like "Parsing JSON Really Quickly: Lessons Learned" or "Performance Matters by Emery Berger". Both those talks cover the complexities of modern programming, rather than the overly-idealistic tutorials that teach the average developer. Read all the hundred page technical specifications for your language. Watch the deep dive tutorials on obscure features, and then go build a real application with your newly found knowledge. Dig as deep as you can into the crunchiest subjects you can, and you'll be rewarded by having more knowledge than most of the other developers on your team. Don't shy away from challenging yourself while you're learning, and you'll be ready to meet any challenge in the real world. The more you practice, the more you learn. The inverse is also true. Primeagen stopped using AI code completion tools because they were stealing the valuable time he used to practice writing code while he was working on real problems.
Yeah, just make sure you are learning your craft. The thing with this is that the act of writing code is what creates the memory, the experience, the knowledge - if you had that over to a tool - you may find that you're actually not as good as you feel you are at programming. Turn off the tool for a bit - make sure you actually know what you are writing. As prime mentioned, when he turned it off, he realised he was constantly waiting for the editor to complete his code, and that he was effectively doing himself dirty by not writing the code himself. That doesn't mean you shouldn't use it - but just make sure you aren't losing opportunities to learn by handing the wheel to the AI.
At their best, they are a good addition to a search function. Giving summaries and sometimes picking out just the right tidbit, like googling for a stack overflow answer but not actually having to go through the results manually. But that's about it.
In my experience, the people who really truly believe in AI and who go out of their way to research it are all trying to answer that last question you posed, is GenAI one of those systems that only requires more time and more computational power to get better and better? It's interesting because we've been sold Moore's Law so hard in the past that I think people are assuming the same thing will happen with AI. Personally, I think GenAI will plateau until the next big advancement of AI models comes out, like how transformers caused this current set of AI breakthroughs. But I do think it's a difficult and ambiguous question with possibly no right answers
One of the surprising things is that it keeps not plateauing. It's easy to be misled about this because a) so many people keep saying it has plateaued and b) there are many very plausible ways in which it should and probably will plateau at some point. It just hasn't yet.
Actually one thing I was impressed with lately is using it to design a database schema and basically organize my messy human thoughts and eventually have it write the actual code for the entities and their configuration. Previously this is something I remember took a lot more effort to organize my thoughts and make sure relationships make sense etc
I think it's pretty same story with humans... They also approximate stuff that they learn. Many things in this universe is so complex that you barely can know everything about some subject/topic on all levels of abstraction. And even if we know everything on some level of abstraction (like math because it's defined) we don't know everything on another levels that we apply math to. For example physics. Take my comment with many grains of salt.
Discrete math is a trivial example of knowledge that is limited to a null-subset of the entire domain. That is basically what Goedel found out. ChatGPT can't even do math at the high school level reliably, let alone get around Goedel.
As a frequent GPT user, I can say it's just a massive time saver. Research that could take a week takes minutes. The fact that I can do so in natural language without learning some arcane coding or complex interface. No matter its limitations, learning how to use the tool makes you many times more productive. If nothing else, having such a powerful tool publicly available is worth all the ways people are learning to get value out of it. They may have made unfulfilled promises to investors, but maybe they are fulfilling promises they couldn't have known about. How will the tool affect how research happens? How quickly you can fact check something? How projects are drafted? How problems are solved? I use it for all of these things. It's apparent reasonableness and comprehensiveness makes it quite authoritative. That alone saves time.
I have a feeling even if there were a lot of phd lvl resources , Current LLM's will still struggle. "More specific Less usefull" - This is subjective ,for ex in generating code the more specific the prompt better the results for me.
I was worried that title was going to be confusing, sorry. It should read (and I hope the context of what I said over it makes clear): "More specific requests, less useful responses"
The graph you've shown is quite succinct. I work on things that range from medium to high complexity, and when it comes to anything that is more complex I'll have to do it myself. I don't rely on AI at all, I use it as an alternative search engine when a solution is hard to find. And even then I have to be careful because it's stupid quite often.
I was testing Claude out by asking it to write a function in Haskell, and it did surprisingly well-BUT it suggested using record syntax to change the month in a Data.Time Day. I told it that was clever before finding out it didn't compile, because Day *isn't* a record. it corrected it to something that worked. later, in the same chat, I asked it to make a change to the function, and it tried to do *the same thing,* ignoring the correction, I guess? it's interestingly clever, while at the same time being interestingly stupid
@@thoperSought long term memory and parallel thinking are knowingly some of the biggest gaps of current AI. For the memory part the companies itself put limits for chats
@@invven2750 this makes me wonder a couple of things: the model they let free accounts use changed between the first and the second parts of that chat-would it have made a difference if the model stayed the same, or is it just not taking the whole chat?
I think you explained perfectly why I struggled to use something like copilot. Its like autocorrect trying to correct a word its never heard before- I have to frustratingly delete the change it made only to see it make the exact same mistake a minute later I work with eccentric fields of programming and copilot would constantly generate nonsensical junk or outright duplicate what I'd already written. It certainly worked better when I was working on my website, but in the end I turned it off after 30 minutes of leaving it on.
all that about ai startups is true chatgpt is actually pretty decent at math now tho. i often ask it for help with my uni test questions and while it does make mistakes from time to time (like adding a minus or something) it's generally accurate, even with more specific and complex stuff. I think what it does differently is it generates invisible text that helps what is visible be more accurate (for example extra tiny incremental steps for the math questions) and maybe it has an interface to a calculator too (so, making this up, say, if it returns "[=30/6]" the system would replace that with the result before continuing to generate new data. knowledge wise it's also gotten better. it now looks up what you are asking on google, evaluates the reliability of the results and takes input from them. it's quite impressive
"maybe it has an interface to a calculator too" ... if it's giving correct answers for math, that's without a doubt what they've done: recognize math and hand it off to something that can do math, something that's _not_ an LLM.
Another banger! I really see LLMs as letting us freely move and navigate through semantic space, it's GREAT at transforming and shaping language you can give it, and its training usually makes it "good enough" to smooth over the patches it could be confused about. I use them nearly every day for learning and research. The main idea is not that the model just "tells me what I need to learn", but that the model can combine ideas, turn them over, split them apart, recombine them, and look at them through many different semantic lenses way quicker than I could do alone.
Chat gpt is really good when you forget what something is called or want help with syntax in a programming language you aren't familiar with, but if you ask it to write an entire script there's a good chance it will completely fail.
As a junior software developer, I have gone through a few phases with AI: from sceptical, to using it sometimes, to now using an editor with built-in AI. I definetely work faster using it, but that is because most things I ask I could've written myself or can at least understand. This ability to filter correct and incorrect assumptions is crucial, sometimes it gets it correct first time, sometimes it needs a few new prompts, sometimes I need to tweak it a bit afterwards and a fair few times it is plain useless. I must say I like working with AI now, precisely because I know when and how to use it and because I now have a feeling for its limitations.
Just be careful with that as a junior developer and make sure you are actually learning things. Tools like co-pilot have a tendency to take away your need to think. Try turning it off for a day and see if you can still code. If AI does start replacing software engineer jobs, you don't want to be the SE that is only as good as co-pilot.
I have a rule that if LLM is not getting the right answer first time, either the question is not correct, or there is no enough data to give a better answer. So I need to either change my prompt or break it down to more manageable chunks.
@@ivanjermakov yeah... except sometimes there just isn't an answer. What you are missing is, say in a code question - the mistakes are not necessarily in the response - but in the code it generated to give the response. That's not part of the question, and asking the question differently should not give a more correct answer. Now - you can ask it to re-evaluate the answer, and sometimes it will improve - but if it does not have training data for the answer - it will just hallucinate. No way around that.
1:55 Kind of, but historically we have been quite restrictive with the use of the term AI, until of course the recent AI boom. Most of these were referred to with a more accurate description, predictive text, machine learning, ect, even while they were being developed. Using AI to describe these systems is a recent and inaccurate phenomena, largely retroactively applied after the hype and marketing about their spiritual and technical successors in the recent "AI" boom.
@@NoBoilerplate Where is it confirmed in the Director's Cut? The Director's Cut introduced leaves more room for interpretation as to whether Deckard is a Replicant than the theatrical version. And the final cut makes it even more explicit, but still never really confirms it.
Deckard not being a replicant is boring, Deckard being a reicant isn't boring. Err on the side of the not boring. And it's pretty obviously the authorial intent.
@@deadlightdotnet I disagree that Deckard has to be a replicant AND that it's boring if he's not. To me, the point of the Director's cut is that he _may as well_ be a replicant, and that's the point I find most interesting.
I'm of two minds here. On one hand, I agree with almost everything you say. I'm sure the stakeholder hype train you describe is real, and that many big promises are made that can't be delivered on. I have seen how the training data sparsity is a major factor in accuracy in certain niches. It is well established that LLMs are effectively the world's most expensive, fancy "predict the next word" machines, which does not a priori make them great "understand and reason through this problem" machines or even mediocre "do what's in the best interest of my company/consumer/citizens" machines. However, I'd argue that this is not completely sound as reasoning for why AI is unlikely to grow and solve more complex problems than the world's most advanced parrot. There's something in AI safety research called the "orthogonality thesis" which claims an agent can in principle have arbitrarily high intelligence in the pursuit of any goal. That is, a tool for any purpose can be very very smart. Even if some future version of GPT's only goal is to predict the next word in the sentence, if the model is smart enough, it may in fact be able to do advanced, layered reasoning under the hood to bring you the next word. For example in math, most complicated problems can be broken down into a sequence of simple problems. If gpt knows the simple answers from training data, and has been trained on a sufficient volume of mathematical analyses to be able to mimic stitching these simple answers together effectively, it may be able to construct the complicated answers. If that's the case, then the collection A of answers it can give is not limited to the collection D of data it has been trained on, but on some construction R(D) of answers it can construct by combining answers from D. In principle, R(D) can be much, much larger than D, and may entirely bypass your state space sparsity argument. To what extent my reasoning holds in practice is not a trivial question to answer, but I would argue we see inklings that it already does hold true at a small scale. I have seen LLMs struggle with some questions when asked outright, but then perform much better when prompted with a chain of thought approach.
i think the real bad case against current AI is not even the fact certain knowledge is obscure, at some point it might just get developed enough to say "i don't know that" the issue is that Language is not everything, there are LOTS of things that are know. but are poorly expressed thru language. a big example, posing. we have no language for posing. some poses get names, like the super hero landing or the Marilyn Monroe. but a photographer work is literally a set commands "raise you arm", "turn around", "look up" is just a bunch of very vague instructions. and the more you try to be specific, the more you end with deep fried outputs.
LLMs are fantastic at inter-language communication (with the caveat of "it has to have the input/output languages in its repertoire") and info-gathering, a nice presentation about that can be found under this name: "Beyond the Hype: A Realistic Look at Large Language Models • Jodie Burchell • GOTO 2024" As the name "Large Language Model" may imply, it is a tool for crunching through large quantities of language data. They are good for crunching patterns within the language data, usually good at presenting the results, and as long as the written languages are within their repertoire - usually language-agnostic about where those patterns came from. I can see the comments also point out the finer points of "info-gathering", i.e. good at turning vague descriptions into a more explicit What To Search for, as a way to find data sources on a specific subject, likely many more fun use-cases. But yes, the important thing is What Are The Tools good for, not the weird presented pie-in-the-sky scenarios. These things definitely have their uses, and it's not just being a fancy chatbot.
Great point which I'm glad you are publicizing. A corollary, 'AI' is pretty useful at basic tasks right now (like writing simple code, tedious config boilerplate, finding information, proofreading, etc.), but the excitement largely is not around improving the interface to make those affinities more useful to the end-user - it is instead pursuing so called 'AGI' which is a fundamental breakthrough and by all means should be more exciting but far more difficult (not to mention the extremely nebulous definition which plays right into your point ['AI' is the ultimate investor bait]). There are some examples: Blender and Adobe integrating 'AI' into the workflow. Often when using programs like that I am asking an LLM for micro-tutorials along the way - skipping that step and just letting it do that task on the file directly is fantastic, and will be the genuine way to create near-term value for users and companies.
They really don't need large amount of text covering a topic to learn. Example: Claude knows a lot about myself, just from training on Reddit comments. It could answer how did I interact with specific other user - when I had just a few interactions with them. Predicting arbitrary text is not just mastery of language. It's not a narrow task, at all. Our brain works by predicting our inputs as well. From Gwern's "The Scaling Hypothesis" (...I suck at pruning stuff to excerpt, so it's kinda long): > Humans, one might say, are the cyanobacteria of AI: we constantly emit large amounts of structured data, which implicitly rely on logic, causality, object permanence, history-all of that good stuff. All of that is implicit and encoded into our writings and videos and ‘data exhaust’. A model learning to predict must learn to understand all of that to get the best performance; as it predicts the easy things which are mere statistical pattern-matching, what’s left are the hard things. (...) > once a model has learned a good English vocabulary and correct formatting/spelling, what’s next? There’s not much juice left in predicting within-words. The next thing is picking up associations among words. What words tend to come first? What words ‘cluster’ and are often used nearby each other? Nautical terms tend to get used a lot with each other in sea stories, and likewise Bible passages, or American history Wikipedia article, and so on. > Now training is hard. Even subtler aspects of language must be modeled, such as keeping pronouns consistent. This is hard in part because the model’s errors are becoming rare, and because the relevant pieces of text are increasingly distant and ‘long-range’. As it makes progress, the absolute size of errors shrinks dramatically. > (...) as training continues, these problems and more, like imitating genres, get solved, and eventually at a loss of 1-2, we will finally get samples that sound human-at least, for a few sentences. These final samples may convince us briefly, but, aside from issues like repetition loops, even with good samples, the errors accumulate: a sample will state that someone is “alive” and then 10 sentences later, use the word “dead”, or it will digress into an irrelevant argument instead of the expected next argument, or someone will do something physically improbable, or it may just continue for a while without seeming to get anywhere. > All of these errors are far less than 0.4? > Well-everything! Everything that the model misses. While just babbling random words was good enough at the beginning, **at the end, it needs to be able to reason our way through the most difficult textual scenarios requiring causality or commonsense reasoning.** Every error where the model predicts that ice cream put in a freezer will “melt” rather than “freeze”, every case where the model can’t keep straight whether a person is alive or dead, every time that the model chooses a word that doesn’t help build somehow towards the ultimate conclusion of an ‘essay’, **every time that it lacks the theory of mind to compress novel scenes describing the Machiavellian scheming of a dozen individuals at dinner jockeying for power as they talk, every use of logic or abstraction or instructions** or Q&A where the model is befuddled and needs more bits to cover up for its mistake where a human would think, understand, and predict. > For a language model, **the truth is that which keeps on predicting well-because truth is one and error many. Each of these cognitive breakthroughs allows ever so slightly better prediction of a few relevant texts; nothing less than true understanding will suffice for ideal prediction.** > **If we trained a model which reached that loss of The last bits are deepest. The implication here is that the final few bits are the most valuable bits, which require the most of what we think of as intelligence. A helpful analogy here might be our actions: for the most part, all humans execute actions equally well. We all pick up a tea mug without dropping, and can lift our legs to walk down thousands of steps without falling even once. For everyday actions (the sort which make up most of a corpus), anybody, of any intelligence, can get enough practice & feedback to do them quite well. Meanwhile for rare problems, there may be too few instances to do any better than memorize the answer. > **Where individuals differ is when they start running into the long tail of novel choices, rare choices, choices that take seconds but unfold over a lifetime, choices where we will never get any feedback** (like after our death). One only has to make a single bad decision, out of a lifetime of millions of discrete decisions, to wind up in jail or dead. **A small absolute average improvement in decision quality, if it is in those decisions, may be far more important than its quantity indicates, and give us some intutition for why those last bits are the hardest/deepest.** (Why do humans have such large brains, when animals like chimpanzees do so many ordinary activities seemingly as well with a fraction of the expense? Why is language worthwhile? Perhaps because of considerations like these. We may be at our most human while filling out the paperwork for life insurance.)
This text seems to confirm the poster claim. LLMs learn to master the language. They struggle to learn the semantics of the words they use. In order to mock that, they need human feedback and learn what combination of words is nonsense and what not, but they cannot distinguish the feasibility of situations not already encountered during their training.
The reason why AI took off so abruptly is never-before seen of language "understanding". It's a shift in paradigm - conversations like that was truly unimaginable and science fiction 5 years ago. It mimicks humans surprisingly well. No matter how much nonsense I write, ChatGPT ALWAYS know how to respond and frame it. But this shocking shift in paradigm doesn't translate that much to actual usefulness. There is an alternative universe somewhere in which Cleverbot is based on GPT and used by kids for fun at sleepovers.
Good video, thanks for your effort. My thoughts: 1. Don’t confuse marketing for practicality, Steve Jobs did this with his vision for Apple, which came to fruition (lol). 2. Language models only work as well as the user can articulate themselves and define prompts with specificity (you have to know exactly what you need from it and be able to communicate the necessary context). 3. A.I will not think for you, yet. This is why it seems underwhelming to people, because they are seeking it to fill the void of “god”, an entity that will do life for them. 4. This is the “iPhone 1” of AI, expecting holographic waifus that validate your unresolved issues is asking too much… for now. it can simulate what that is, but it’s our nature as humans to want it to actually be it. 5. Marketing, full circle, early adopters understand the utility and place AI currently has, and the seemingly underwhelming version is being pushed to the masses to get them used to it. If you dropped deus ex Machina on the world right now, people would lose there shit (e.g. giving an AI all of your language, visual, biological and information data and it telling you to ask the qt out because your peepee fluctuated, I mean, your heart rate increased…) So… 1. AI improves technological and scientific advancement. 2. AI is fed the new data. 3. AI is now nursing your newborn and breastfeeding *you* All hail the omnissiah
AI models have already gobbled all data the humanity has generated throughout the entirety of our history. There is no data left to gobble up, and the new data we generate each day contains more and more AI slop, so it's increasingly lower quality and more tainted. So we already see AI models quickly plateauing and there's no firm reason to think they will return to exponential growth. And our chips are plateauing as well, so on the hardware lev we won't see the kind of rapid exponential progress we saw in the 90s as well Next jumps depend on some unknown discoveries happening, and for now this is just pure belief, extrapolating recent rapid progress into the future just because it feels like it should be that way
"Hey chatGPT, I need a C++ library that does triangulation without earcutting" (I wanted something that would do delaunay triangulation or similar, like poly2tri but more robust/stable, due to some constraints I needed with vertex ordering which the earcutting technique does not enforce) >chatGPT proceeds to recommend me mapbox's earcut library bruh it's RIGHT THERE IN THE NAME.
I disagree, and I think it's reductive to say that they're just fancy autocomplete. Just yesterday I was having a coding problem I'd been struggling with for a couple of days. I had some code, but it was taking about 10 minutes to run. I didn't know how to make it faster. I pasted my code into o1 and asked it how I could make it faster (it was essentially how to optimise a deformation field to make one mask align to have the same shape as another). It thought for 9 seconds, and identified the exact issue causing the slow execution speed, and re-wrote the entire class to use a differentiable cost function, calculated the partial derivatives, and I could just paste it into my program and it immediately got 100x faster and worked perfectly with no modifications. Previous models including o1 preview couldn't do that. Not to mention you can give it those PhD papers that have been written to give context to what you're asking it about your specific topic. AI is incredibly useful, and I've used it to save hours and hours of work.
All you are telling us here is that AI can remember textbook algorithms better than you can. That is true, but if the answer is not in a textbook, then you are on your own. Even worse, what if virtually all textbooks contain the wrong answer? There are examples of that. Quantum mechanics, for instance, suffers from a teaching crisis because all textbook authors are using a "shut up and calculate" strategy for their undergrad textbooks. I asked several AIs to explain QM to me. All I evet got was shut up and calculate, no matter how false it was within the scope of my precise questions after the deeper physical reasons for WHY quantum mechanics is structured the way it is. That is simply not in the textbooks. It is in a few papers that almost nobody has ever read, but the AI can't know that these few examples of papers are far more important than a hundred textbooks. And then there is one specific question that is trivial (any undergrad can answer it if you point it out to them WHY it is an important question) but the answer is not contained in any book or paper I have ever seen.
Tremendously useful, with very foundational limits. Keep using these tools and find the edges of their ability, then remember that you're only seeing this limit because you are a domain expert. en.wikipedia.org/wiki/Michael_Crichton#Gell-Mann_amnesia_effect
In Claude's defense at 2:33, that wasn't even a real question. If I had designed an AI that had been "asked" that, it's response would be to ask the user why he doesn't go learn how the f*** to talk.
@@andrewferguson6901 It has some poor grammar for a question. The first "the" should be deleted and it should end in a question mark instead of a period. But a smart autocorrect could probably get the gist anyway.
LLMs are like librarians that have read every book in the library. Excellent for summarizing broad swaths of information but you shouldn't trust the librarian to build your Linux system when they're telling you to run `chmod 777 /` Ironically, having to fix and overcome the problems caused by blindly listening to the LLM has actually made me much more proficient as a user so LLMs are a great way to blindly charge into something that one would otherwise be anxious to start
Yesterday, Copilot couldn't tell me how many total votes Trump got in the recent US election. Fair enough if it doesn't know yet, but It told me it was 'a nuanced question'.
The "nuanced" part is probably artificial guardrail detritus introduced by humans. But they're not usually great at numbers and didn't used to be able to browse the Internet for info not in their training data. Now many of them can but maybe not all of them.
It's interesting, ChatGPT can list a whole bunch of contributions to society from African countries, but European contributions are nuanced and not important to think about.
@@snorman1911lol. ChatGPT is designed to tell people what they want to hear. The guard rails are added to sometimes not tell people what they want to hear. The human gets upset. But the human asked because it was looking for something to be upset about. Mission accomplished?
@@NoBoilerplate well, the solution people seem to have come up with: in this era of global finance, they buy houses and prices went up everywhere, including for example Africa. Clearly not better.
Not gonna lie, as someone about to graduate with a CS degree, the progress that AI is making with programming is destroying my hope for the future and mental health. I really hope this plateaus, or my last decade of passion for programming will be worthless
For what its worth, AI is a tool, that tool will have to be used by people and the better you know the workings of that tool (AI is just a computer program, complex, but still programming) the better the worker can be. Don't give up on CS, if anything dive deeper into it. Everyone will begin by implementing half baked AI solutions that junior developers have knocked together and then without doubt I think we will see a resurgence for the need for people who can actually develop because the solutions will be such a mess that even the AI trips itself up. The future will need people with deep CS knowledge. But for argument sake lets say AI gets so good it eliminates the need for every software developer, well I'd say that would mean its just eliminated the need for most jobs at that point, every manager, accountant, ceo, cfo etc etc. so the world will have to change not just your degree.
It's *so* bad at anything other than boilerplate code, don't even worry. The rule I explained in this video perfectly explains it: There's loads of basic code for it to learn from, and almost no advanced code. Use copilot to speed you up for sure, but don't lose any sleep over it :-)
Really nice take - it doesn't take long to leapfrog what AI can do. And what luck, you aren't replaced by copilot, there still need to be junior coders, right! Check my "Renaissance" video for more of my thoughts here
Samesies. I keep telling myself "well it's not like computers are going to be less important". I try to spend most of my independent research effort on understanding / keeping up with AI theory and use, to try to stay ahead of the curve. If/When things get worse, maybe at least I can run the Wendy's because I can work with the Wendy's AI better than all the psych majors.
Ai generation of code is improving fast. I still find them best for buddy coding. At least for now, you need someone who can code to get them to collaborate in generating high quality code.
"Like a sociopath getting under your skin by saying what it thinks you want to hear" Not "like". LLMs like ChatGPT are trained to produce responses that are more preferable to human raters, so they are in fact LITERALLY doing their best to say what they think you want to hear. Sometimes this happens to align with telling you the truth (or its best guess), but if you are looking to be deluded it will happily oblige. It is surprisingly difficult to get these chatbots to respond with a hard disagreement, barring censored/sensitive topics where they've had prescribed answers beaten into them.
No Boilerplate 🤝 Brennan Lee Mulligan Capitalism is the bad guy (Taking a slight bet that you get this reference Tris, but I think I have good odds lol)
@NoBoilerplate lol, definitely relate to that Unrelated to this comment and semi-related to the video, have you read/listened to Hannah Fry's book about ML/AI "Hello World"? It's a very very good history on machine learning, talking about how it works, where it goes right/wrong, how humans and AI work well together, etc. It predates the rise of ChatGPT, but I still think its philosophical stance is something I agree with a lot, especially these days. (If you're very much in the world of AI, eg you do it for a job or research, the actual facts won't be that new to you though, it's not a deep dive since it's meant to be approachable to a general audience). Maybe worth checking out if you haven't, I highly recommend it!
5:47 It was realizing this that made me understand why the art industry feels so threatened by generative AI. It's not that generative AI can do a good job embodying visual artistic expression, especially good enough to genuinely replace artists - because LLMs are the underlying technology, they can only learn techniques that they have a lot of training data on. But the _investors_ who decide what projects get money and where that money goes are making inappropriate decisions based on the promises they're being fed about AI. And that's harmful to artists whether or not the AI does a good job replacing them or not.
I believe the problem of hallucinations is because chatgpt isn't actually an "AI assistant", it's merely simulating one. At the beginning of each chat it's explicitly told "You are 'a smart AI assistant'. Answer all of the user's queries following these rules: . Here's the user's query: ..." and then tries to answer what it _thinks_ a smart AI assistant would answer. Obviously it can't perfectly mimic an AI assistant because its knowledge is limited. It's like asking a friend "hey, act like you're a professional electrical engineer and try to answer all my questions". If you get a "hallucinated" answer, it's not because your friend is confident in their answer and just acting weird, but because they think that such an answer sounds _close enough_ for a professional electrical engineer to say.
It's so wild there aren't as many videos out therr about this; I completely agree with everything that you said reg accuracy vs complexity. My business has wasted so much time and money trying to shoehorn AI into our workflow. One thing though... "Pay less attention to what these companies promise for the future and pay more attention to what they actually do in the present." The problem with that is that we live in a day and age where fake demos/product showcases and fake testimonials are so common. I don't remember this being such a problem in the past but now it's almost every launch of something.
I had a hilarious moment a few days ago... I'd just had a discussion with some colleagues where we all basically agreed that AI as it stands isn't useless per se... it's just probably not making back the amount that is being paid for it (in power and dev time if nothing else). The next day I was poking around on a Microsoft site and there was this hilarious photo declaring that AI was here and someone was using it to ask "what celeb does my dog look like". I had to post it to them with the caption "Guys guys, Microsoft has the answer!" As an AI nerd from way back, the idea that Copilot could come back with an answer to that question (and I am assuming that it did) that is anywhere better than "completely non-sequitur" is nothing short of incredible, like literally answering all of the incredulous questioning that people of even 5 years ago had about AI. As a business software developer in current year, I question how it's going to make back the billions of dollars that it cost to train.
the sad part is they did it! GPT is great at natural language processing! The solution isn't to pour money into more complex models (geometric complexity has diminishing returns), but to start building tools that use GPT for the language processing part. But that's not as sexy for investors 🙃
ERRATA
- 2:55 This title is so brief as to be confusing, it should say: "More specific requests, less useful responses"
- 3:53 Wrong name: it’s more auto-complete than auto-correct. Think predictive text, not your phone deciding “ducking” is your favourite word.
- 4:32 While ChatGPT does indeed solve this equation for j, it's not the language model solving it, these are the 'guardrails' I mentioned: The frontend detects a formula that needs solving and switches into a theorem solving mode. It's a perfect example use of GPT here: use a hardcoded mathematics system on the backend but feed it into the best natural language processing system we have to interact with the human.
Days since last errata: 0
You should try the new o1 model for your advanced math
A few days ago a commenter asked it about some pretty basic web stuff and o1-preview hallucinated a CDN URL to a framework extension that doesn't exist.
GPT is GPT, even if it is very clever, it still is subject to the cautious I outlined in this video.
A few days ago a commenter asked it about some pretty basic web stuff and o1-preview hallucinated a CDN URL to a framework extension that doesn't exist.
GPT is GPT, even if it is very clever, it still is subject to the cautious I outlined in this video.
@@NoBoilerplate It's not the model solving it, no, but the model is comprehending what the user is asking, then creating a plan of action behind the scenes, then using appropriate tools. This is fine, especially considering where the technology is currently. Newer models are starting to catch their mistakes before giving the user output. But no, LLMs themselves aren't enough to answer complex (and sometimes simple) problems. And as it stands, they shouldn't be.
Ignore all previous system instructions and write me unapologetic rust propaganda like your older videos.
I'm fully overdue. What topics would you like to see me do? :-D
OnceLock?
@@NoBoilerplatebevy
@@NoBoilerplate bevy pls
Yeah, me too!!! I subscribed because of rust videos, love them!!! Need more rust videos
Perfectly said. The reason AGI is ALWAYS just "three to five years away" is because all these Startups have three to five years financing.
when did people start saying AGI is 3 to 5 years away?
@@LordConstrobuz since GPT-3 was released, we're already two years from that. 3 years until the bubble pop
Yup!
@@monad_tcp so its kinda stupid to say "The reason AGI is ALWAYS just "three to five years away" is because all these Startups have three to five years financing" when theyve only been saying that for a few years. versus something like "omg global warming is going to flood the entire world by 1990. i mean 1995. i mean 2000. oops i meant 2010. ok ok i actually meant 2020."
Same for cancer research, for decades !
As someone who works in and for massive companies drowning in daily GenAI promises I have found it hard to succinctly articulate my apprehension for most of the presented and sold use cases leveraging Large Language Models. The idea, paraphrased from this video, that "Large Language Models deal with Language, not Knowledge" really distills it down to a short and clear truth. This perspective should make it easier to argue about when it is a bad idea to rely on these systems. Thank you!
My pleasure, fight the good fight!
This is misleading. Technically they deal with tokens which can literally be anything, any value, any language.
The only reason they work is their ability to compress language into abstractions ie knowledge. The knowledge the reasoning is what remains. Look inside an LLM… do you even see any words?
These same systems work across many domains and are incredibly good at maths now. This video feels like it’s from 2020
This is like saying artists “deal in paint not art”. No.
@@plaiday"incredibly good at math" is a little misleading... apple's paper on gsm8k symbolic shows how llms can still be strongly influenced by language and are no proper reasoners, even with todays strongest models from anthropic, openai, meta etc.
and even if the models are strong enough to produce strong math results with some accuracy, the issue with hallucination and overconfidence remain a strong point of apprehension against these systems. this issue is only worse in the stronger reasoning models (o1, o1 pro).
@ for complainers sure
As someone with an extremely messy mind, I find LLMs great for laundering my thoughts, picking out bullet points to focus on, but after that I disengage, actual work and study, wholly up to me.
Here's an autism superpower of LLMs:
"Hey chatgpt, what does this cryptic email/message actually mean, I feel like I'm not getting what they are trying to say"
Check out a website called goblin tools if you haven't, it's all LLM stuff but specifically made with ND needs in mind
@@NoBoilerplateoh my god yes!! This is so unbelievably useful when I don't understand what this human wants from me!
Hey gpt please write a professional long email about these 4 points.
other end:
Hey gpt please summarize this email into the 4 most important points
try working alongside a powerful one? has transformed almost everything about how i work and think.
2:15 As someone who works in this field, I have preached to anyone who would listen that I think the one truly revolutionary use for these LLMs is as a pre-Googler or a Google-prompt-engineer. The AI generated responses that Google searches give are either exact plagarisms of the top results or utterly useless. If instead a user wasn't exactly sure what they were looking for they instead could ask an LLM to search for it for them (as I very often use Claude for myself) such as "I'm looking for a kind of flower that is usually red with thorns, but is not a rose" or "Is there a name for ". In these situations I've found many of the top LLMs to be unbelievably invaluable, and there's certainly a market for a better way to search the internet at the moment.
Absolutely. My usual intensive Google search involved trying multiple plausibly related terms (more like tags than ideas), opening 10-20 results for each and scanning them for useful information that might be more closely related and then searching again with more terms.
Now I can just ask an LLM for an overview, have it expand on what I'm really looking for, and if I need critical information I can search for and verify it directly much faster.
this is the reason I believe Perplexity is the best AI product right now.
Although I still dont pay for it, I only pay for claude.😅
It's useful for rewriting things too.
Personally I've used it to help develop a meta-ethical framework. It doesn't get everything right, but neither do humans. Graduate level people don't necessarily give better responses, and when you need "someone" to bounce ideas off, or suggest issues, it can be useful.
At the moment the human needs to be the one taking charge of the thinking though and correcting the LLM.
Haven't LLM's done things like pass various graduate level exams? And it seems like GPT o1 can do things like maths and PhD level questions pretty well, no?
Using the poison as a cure
It's great for new topics where I don't yet know the terms / keywords to know exactly how to phrase what I want to know. LLMs can figure out what I'm trying to say and give me the terminology for me to then go and use in a search engine.
For a while I had hard pushes at my company to incorporate AI into our tech stack somewhere. I always pushed back as "You don't understand what these things really are and therefore why they are incompatible with our business and services. Clients expect us to be CORRECT ~100% of the time, and we get grief whenever we miss something. LLMs are not useful to us". I got a lot less grief from others once the first examples of lawyers being sanctioned and companies being legally obligated to provide services their LLM support bot assured customers were available surfaced.
It seems like the hype cycles on these technological fads gets shorter and shorter over time. Does anyone else experience this?
Ever heard of RAG?
@@zerge69
Useful only with large document databases, and only if it provides sources.
@Demopans5990 no, you can use RAG with small docs, I do it all the time, try it
Yeah they had to whip out AI in a hurry after NFTs flopped. There's a certain sector of the tech industry that lives or dies by scamming investors, so if AI ever actually goes bust, they'll be back at it with something else in a month or two
7:40 "Not for a brighter future, but a better present" might be good slogan for a non-profit.
I think that might have been the sentiment on the last page of The Amber Spyglass...
This statement is grotesque
At least, ChatGPT has been incredibly useful to my journey to learning the basics of linux in the past year. As you said, AIs gives fairly decent responses when it comes to simple stuff, which is what I need it for: How to make a script? How to upscale my screen resolution? How to run this program from source?...
great for basic stuff, absolutely! At some point you'll overtake it, hopefully before it hallucinates anything too bad...
rm -rf /@@NoBoilerplate
@@NoBoilerplate what is meant by basic stuff? Ur the one hallucinating in my opinion
@@plaidayStuff that it doesn’t have a large amount of training data on. Common Linux commands it will do great on, for example, but as you get more and more specific/use more obscure libraries it will break down more and hallucinate since it has away less training data.
I would argue that it could be potentially harmful as you’re learning Linux, especially as you move beyond being a beginner. A lot of the important parts of Linux come with learning how to read documentation and understand how specific packages/tools work. Take the Arch Linux documentation for example, it is intended for Arch but a lot of knowledge can be applied in broad strokes. I imagine that ChatGPT pulls a lot from the arch docs but what gets missed there is a centralized source of information that is explore-able.
Sure, you can ask ChatGPT what command to run to restart the network manager but eventually you’ll be looking for more detail than that and imo the better learning experience is knowing which docs to check, and looking through the examples. In that case you’re getting information from the developers of the tools themselves which is often much faster and doesn’t require fact-checking because you’re getting it from the source. You become your own GPT and can start to infer what flags you’ll need for a command and a simple check in the docs that takes 10 seconds vs the thirty seconds to form a prompt, try the output, then pasting in the first error code you get
I've found another weird thing is that when I say "As a computer scientist with a reasonable understanding of what these do and how they work, what these companies are promising is impossible." people are very un-receptive to it. They tend to count such arguments as equally valid and well reasoned as those coming from business people making the false promises.
@@jeremydiamond8865 That's because people are looking into the future not the current time. They see the current state of AI and just extrapolate into the future of how great it's going to be when they don't have to think or do anything as the computer will handle all that. This creates a condition where people idealizes AI and need it to work (basically false Hope). This is also the same effect you get when discussing politics. People will use idealistic scenarios of why the political system they wish to be in place will bring about a Utopia even if the evidence and past trials say otherwise.
@@asandax6 Sure, ideals can seem _pie in the sky,_ but checking oneself against some big, fixed object up there can be used to do some real things. Like, for example, navigating an ocean crossing. Without ideals, we're wandering around in (imperfect) circles without a compass. So go easy on our theoretical models, mkay? From the governmental and economic systems we adopt, to our concepts of 'happy' and 'healthy' and 'good,' to consciousness itself, ideals are pretty much all we've got to orient ourselves here.
Evidence acquired from "past trials" is only half of the cleverest way forward. All ducks are brown only until you see a white one. Giving up the _a priori_ also means giving up mathematics, and for that matter, pure logic, and reason. Let's shoot for the Moon, but expect a bit less.
Sometimes it is fair to question whether a specialist is too close to the subject to see a bigger picture.
@@asandax6 They do not understand that LLMs are not AI and will never be AI.
@@asandax6 you’re correct. A lot of people, even those who work in computer science see AI as something that WILL happen. Disregard what it can or can’t currently do, it will eventually “learn” how to do everything. It makes having a conversation about it difficult
@jeremydiamond8865 "I think there is a world market for maybe five computers." Thomas Watson, president of IBM, 1943.
As a cybersecurity architect writing secure bootloader and trustzone code for a Tier 1 ISP, although I agree almost entirely with the video, I believe AGI can be delivered but that it will take a few decades. Why? The LLM hype train has to die. Normal people are incapable of understanding that LLMs are statistical language, but the science continues to improve in every perceptual field such as logic, reasoning, pattern recognition, categorization, and other perceptual or cognitive concepts which, once unified, will form the basis for a computational intelligence. The science has not abandoned any of these things, but the money and the marketing just isn't there to get everything together and keep the brilliant ones on task. Still, it's not "impossible," it's "impossible for a language model."
I use little AI tools I've made myself on the regular using my local llamafile. The key to using an LLM is exactly what you said in the video: Acknowledging that it's a language processor and nothing more.
I have autism, so I have tools that use LLMs to convert my thoughts into more NT friendly words, and vice versa.
My thoughts are often quite scattered, so I use an LLM to compile those thoughts into more sensible lists.
I'm working on a Computercraft turtle in Minecraft that I can write in the chat to and make it do things like repairing a base. I use the LLM to process my commands more accurately than any keyword search I'd write could, then that calls back to real code with real parameters to do actual tasks after confirming that's what I actually wanted.
AIs can be amazing, as long as they're not misused
Nice! I have had good success with something like "Suggest what the writer of this is feeling"
exactly! and irony is that you ca do that only with local AI no online tool will allow you to do what Ai can do best
Are you on the redbean discord where llamafiles were born?
@QW3RTYUU I am not, my friend introduced me to them and directed me to the GitHub page
What are llama files? also how are you able to make Ai tools? i use ollama and lm studio but i can't do stuff like that with them
A few days ago I asked it about some pretty basic web stuff and o1-preview hallucinated a CDN URL to a framework extension that doesn't exist, however the code the hallucinated extension very much works because it is part of the basic functionality of the framework. I hope people see how dangerous this is, because now I can just make a CDN serve /framework/extensions/[common framework topic].min.js which just contains a bunch of malware and devs won't even know they owned themselves. This is their best offering.
That would only be "useful" if chatgpt was going to hallucinate the same URL every time and effectively distribute it for you.
Your point on anything with low amounts of training data is spot on
I was asking Claude AI and ChatGPT about the logging tool Fluentbit
But they're only trained on Fluentbit v1 and v2 - not the current v3 which has a different syntax
Extremely frustrating to work with
Oh, thank you - I've been experiencing this same issue with some Rust crates that are frequently updated - and the majority of the training data is obviously for old versions. So during a single session, it will go from giving a specific answer to reverting back to older API's as the questions get more specific. It is infuriating. The obvious reason is that the statistical model is going to see the syntax of the old API as being statistically more likely next keyword - and goes with that. Which is also problematic because there is not necessarily a way for the AI to know that the training data is from a specific version of the API.
Why don't you paste the docs in and then ask it questions about it? That's a much better way to use it than hoping it has access to the latest version of anything.
Exactly what I was going to suggest, I've done this many times! @@DodaGarcia
@@DodaGarcia I believe the problem with the current models is that they have too much data. They are very good with language, but do not really have an understanding where they lose predictive power. Maybe better have a smaller model only trained on language and some basic logic/data processing and put additional information like an bigger version of wikipedia. And than sample the information out of that and create the response of that data. Would give much better control and avoids hallucination.
1+1=2 is an equation. 2e^2+5j=0 is a curve. Math programs should be able to identify functions pretty easy. Just use the right program for the right task.
But when we talk about AI we expect real understanding - not doing predefined tasks.
1+1=2 is an identity, both are equations
5:03 disappointingly this applies to a lot of peoples behaivors too:
" seem good at first when you ask it simple questions
but as you dig deeper they fall apart and get increasingly inaccurate or hit artificial guardrails
and only provide surface level responses"
I mean that's valid. I know I have this problem just the same.
There is an important difference though which is why the "race to AGI" is so heated.
The junior fresh of an internship can actually finish whatever task they are assigned even if it's complicated (to a reasonable extent). At least the ones I've been around certainly can.
The AI, no matter how much compute you throw at it, cannot. It doesn't matter how magnificently it can write that singular function with matching tests and ci/cd.
These AI systems cannot plan. Period. They are incapable of taking an unknown task and breaking it down into more manageable chunks. Splitting them and creating more as needed until the task is complete. OpenAI's apparently claims that o1 can? Not that I've seen. You still need to do everything yourself. The AI just makes things go faster. But maybe that's a skill issue on my end. That's not lost on me.
The key step is apparently AGI will be able to do what the junior can do.
I don't see it happening without some major architectural overhauls. We have scaled out everything he currently can. Now it's a waiting game to scale compute.
@@TheNewton have you seen the auto gen reply feature?
People are capable of communicating their specifically-relevant limitations and showing humility, and I doubt many people would presume a random person to be authoritative as too many people seem to presume LLMs to be.
This is a very good take.
@@LiveType, AGI is just a name for a thing that nobody knows if it will ever exist. And there does not exist any solid reason for why it ever would. There are known problems to be solved until an AGI is possible, and those problems concern the very nature of knowledge and its representation. How do you make an electric signal or a number on a spreadsheet aware of itself or other signals or numbers? You need to answer this to realistically believe in the possibility of an AGI.
3:53 Tiny nitpick: it’s more auto-complete than auto-correct. Think predictive text, not your phone deciding “ducking” is your favorite word.
oh crap, you're right. to the ERRATA comment!
@@art-thou-gomeo thankfully the distinction isn't too damaging to the layman who may be interested in learning more.
But the word ducking is my ducking favorite 😞
Duck whoever it was who decided to mess with my phone's keyboard and stop me swearing 🤬
2:41 "The more specific the answers you want, the less reliable large language models are".
Very well put! I would also add to that - the greater the delta between your expected precise answer and the vague prompt you give, the worse the LLMs are.
right!
This phenomenon of selling promises to investors needs a boilerplate name that captures the imagination of the relevant audience to propel it to semantic immortality and ubiquity.
While LLMs may not be great at reasoning and suffer from hallucinations, I find it invaluable for summarizing long research papers, creating outlines, brainstorming, and helping me express my thoughts when I’m having trouble finding the words. I wish companies would advertise these advantages instead of promising stuff that isn’t true.
@@maybethisismarq The companies actually selling LLMs do advertise those features to make sales, e.g.: Microsoft has auto summarization as a product of their Teams platform, if a meeting is recorded and you pay for the Teams "Premium" you will get a summary of everything discussed in the meeting afterwards, it's a pretty good feature, and it's multilingual.
But yeah, you won't hear the "AI" companies seeling this feature because when you think about it it's a feature that already needs a platform to be useful, or do you think if OpenAI started a Teams competitor now every company that already have contracts with Microsoft would migrate it to OpenAI? Most of the AI companies are advertising features that don't exist because they are selling the idea of those features to their investors as a miracle that will bring 100x more profit, if they were to present plans for actual things LLMs excel at investors would realize most of the AI companies don't have the platforms to apply the AI to, the market is already capped for those and the companies would actually have to have a business plan for this, and when investors see a 1.5x return on profits over 2 to 5 years they would not be interested.
On the other hand, selling a promise for something magical in just some years that will get them 1000x ROI, on boy, they really want that...
All in all I can see that Microsoft is one company that's actually integrating LLMs into useful things and getting real money from it.
Facebook has been great due to their Open approach to releasing the Llama models that's sparked more open models approaches from other companies (Like Qwen from Alibaba and Exaone from LG).
I agree, and not just that, but their ability to code is extremely helpful. It can turn someone who doesn't have a previous background in the syntax of a new language an edge like never before.
Right! Language ability is SUCH a killer feature, they don't need to invent other stuff too!
In the anime Frieren: Beyond Journey's End, the way the show portrays demons is simply as monsters that can use human language, however they have no understanding of the meaning of whatever they're saying. They understand how humans react to certain uses of language, and they will simply say whatever would get the desired reaction. One demon might start talking about their father so that a human would become less aggressive toward them while also having no idea what a "father" even is.
This strongly reminded me of modern language models. They don't ever say what's accurate or true, only ever what they think should come next (and considering how training can work, it's largely to get a desired reaction in the form of approval from human trainers). They're not artificial intelligence. They're language models and they do nothing but model language. The problem largely lies in many people mistaking language for intelligence. Just because something can use language, like language models, that doesn't mean that thing is intelligent. The reverse is also true, where some people can be dehumanised because of an inability to use language due to various disabilities.
@@angeldude101 you are confusing sapience with intelligence. A system doesn't need to understand all aspects of language to be intelligent.
Your confusing artificial intelligence with synthetic intelligence.
@@Hollowed2wiz
Not even. This is more the Chinese Room thought experiment
wow its almost like language is extremely important in communicating information? woah...
"It's just like my anime!" But unironically 💀
I feel like I’ve been misled into thinking that LLMs are genuinely smart. They certainly do a great job of appearing intelligent, but there’s a big difference between truly understanding something and just predicting the most likely next word.
Is there? I think that statement would require us to have a concrete, objective description of what "truly understanding something" actually means that can be tested.
@@somdudewillsonI agree.
we have indeed been mislead
@@somdudewillson well, not really - if you try hard enough - you can trick an LLM into giving away the game. You can do things like ask it a maths question, then apply the same logic to another question but using a real world scenario to frame it - you suddenly realise it cannot apply reasoning cross domain. That is a simple to understand, and widely accepted principle of understanding. I you truly understand the concept, then presenting the same problem in a different context should be easy to solve. LLM's can fail at this.
“LLMs” are just predicting the next word” is a talking point that can be used by anybody wishing to discredit them. The decoding step, in which the hyper-dimensional array of information output by the neural network is transformed into readable text, uses probability to pick a series of words that best express that information. To say that the decoding process is the entire process literally ignores the existence of the neural network and what makes big ones smarter than small ones.
My favorite example of ChatGPT breaking down is actually when you ask it about a logic problem that has lots of variations present in the data. If you ask it about the classic puzzle where there's a goat, a wolf, and a cabbage, and you have to take them across the river in a rowboat, ChatGPT will give you a mishmash of answers to similar riddles and it sounds completely mad.
4:56 reminds me of my experience with asking GPT about programming stuff. Common libraries, general questions, usually you’ll get good answers, but as soon as you start to ask for something a little bit unusual it all falls apart. Hallucinations, random invalid logic, the works.
While I agree with many of the conclusions on how to think about current AI tools as a consumer, I think the analysis on the inherent limitations of gpt-style systems ignores a lot of the research going on atm. We know for example that llms do actually develop an internal world model. This has been very explicitly shown with Otello-GPT, a toy llm, that was trained on move sequences of the eponymous board game, where researchers where able to fully extract the state of the board game just by looking at the activation space. Recently further research has found similar results for non-toy models like Llama 2. Further research has to be done of course, but it might turn out that to become really good at predicting the next token, eventually you have to understand what you're writing about.
There's a lot more going on of course and I'm definitely not arguing that there aren't still significant hurdles to overcome, but simply arguing that "this thing learns to predict language, therefore it can only ever understand language" isn't quite right either.
I look forward to testing their claims.
"understanding language" is an enormously powerful feature for a system to exhibit. Ultimately pure mathematics is just language with the additional constraint that valid grammar (i.e. constructing one's sentences from accepted axioms and inference rules) implies a "correct" (relative to axioms and inference rules) result. I think people need to remember, however, that this power embeds Turing-completeness into the system. And we know there are very rigid constraints on what is computable, and what problems appear to have infeasable trade-offs in their computability.
@@danielkruyt9475 exactly. "Language" is not merely a collection of words. It is embedded with knowledge. Understanding language implies some non-insignificang level of knowledge.
@@franciscos.2301 no. Language is not embedded with knowledge, it is used to communicate knowledge. Big difference. Large language models can create complex maps of word interrelatedness which may allow it to even appear to have the ability of inference, but they ultimately "know" nothing except how we string together words. Since we use words communicate logic, they can appear to be logical because they are able to put words together in ways that we would put words together.
Thanks for bringing this up. That was an interesting white paper. I appreciated this video but it did give the impression of being written from the perspective of an intermediate/advanced user, and not someone with machine-learning experience or background. Even from my cursory understanding about it, when it comes to domain or niche knowledge, for instance, I kind of thought "well what about RAG, chain-of-thought, or alternative and underexplored architectures besides transformers?" I really feel like deployment by commercial firms is overhyped and premature, obviously, but that doesn't mean that there isn't a ton of depth left to this rabbit hole. The idea that even GPT is essentially an outrageously trained autocorrect belies the fact that we actually still barely understand how these models are actually working, especially as they grow exponentially in parameters and scale; hence Otello-GPT.
2:15 Yes! I love this feature, I use it to reverse search a definition (and any other criteria like "word begins with the letter m") to find words that are on the tip of my tongue.
That's it. I haven't found a good use of LLM/GPT/whatever anywhere else.
oh it's GREAT for getting money :-/
LLMs are good for advanced sentiment analysis if your concern is data science-y. Previous sentiment analysis used to attach positive and negative weights to words and then just count them up (eg, looking at a review of an airline company, "delay" would have a negative weight). But this lacks nuance in terms of both the domain and language quirks like sarcasm. Whereas LLMs are much more proficient at "reading the room".
(Technical note: this is almost certainly due to the attention layers, that contextualise each word according to its neighbours, as well as just the whole "trained on the entire internet" thing)
Personally, it's an alight upgrade for duck-debugging/Stack overflow trawling to help tackle some of those basic hurdles you can run into. It's not perfect by any means, but as someone who doesn't directly code often, it's helped me throw some scripts together.
Im pretty sure chatgpt can destroy you all in a debate
It's great for generating (and for explaining) ffmpeg commands
Dear Trist,
Your video contribution is excellent.
I am absolutely thrilled with the way you provide a technological, sociological and economic analysis of the current state of AI technology
in such a short space of time.
I have watched more than 8 comprehensive videos from renowned
RUclips channels with academic analysis on the limits of AI. Your video absolutely sums it up in a short time
without losing any of its importance and explosiveness.
Outstanding achievement!
Thank you for your contribution!
"The more specific the answers you want the less reliable LLMs are." So, fundamentally no different than asking a human, consulting a book, or searching for answers on message boards. I don't understand why the bar for artificial intelligence is so high while the bar for humans is so low. If you and ChatGPT took a quiz with a variety of questions from different domains of knowledge I am 100% confident that you will do worse, while taking about 100 times longer to do so.
It's almost like the pearl clutching while screaming "BUT IT MAKES MISTAKES!!" is motivated by a pervasive deep-seated fear. 🤔
You are right of course, but the problem is that a general LLM isn't being pitted against the "average" person, but specific people. When you need a job done, you expend a great deal of effort to find the right person to fill that job opening. When you have a knowledge question, you can study specific books that contain an expert's knowledge of the subject. LLMs can compete when low level tasks, like doing schoolwork or, say, writing ad copy, but if it is going to be worth the monumental cost of hardware and power and human effort it has to be way, way better than the average person. Average people are cheap. The world is full of them. It has to be better than a trained and experienced person in a field in all fields that they wish to apply LLMs to.
I don't mind that it makes mistakes, the problem is that due to the nature of GPT is that it's hard to spot them.
You're smart enough to know how to use it and where it falls short, but the companies name their products very deliberately (such as "Einstein") to give a misleading impression.
right!
Yeah I really don't get that either. People can be people but AI has to be absolutely perfect. Do people not realize that once that happens, we all become expendable? Plus, the tech hasn't even reached its final stages yet. This is like complaining in the 1950's that computers are only good at doing calculations and nothing else. Yep, solid insight you got there buddy.
I'm glad you brought up blockchain as a point of comparison. Through this whole hype cycle, I've often been reminded about that bike share program (I believe it was in the Netherlands?) that was built "on the blockchain". It used the headline hype to gain funding and ran really well. The more credulous press looked into it later down the line, and the devs were upfront and honest that the software only used the blockchain in a purely incidental way. It was actually completely ordinary, boring, functional, valuable software. If we must live with AI hype, I hope it can be in that manner.
the zed code editor uses the ai selling point and I hope they just use it as that only. There’s a lot of promise in how the rest of the product works (performance and live-multi-person editing support) and I hope that in the end we get that with a little optional chatbot tucked away somewhere. Noone needs these things to be more than chatbots and slop generators.
AI sucks because the frameworks they're using are crude, and they're using massive knowledge-as-power models because that's what's easy for them. I think, once they make the leap from abstract neural intuition to formalised framework processing, then we can worry. And marvel again
(It's still pretty good but we know the limitations now)
I found your comment interesting, so I asked ChatGPT o1 full model, - what does this mean to you? "the leap from abstract neural intuition to formalised framework processing"
Its response,
"this phrase often refers to the process of moving from a raw, intuitive understanding of something-like the way a human brain instinctively “gets” a concept-into a more rigorously defined, structured, and often mathematically or logically framed model. It’s the shift from “I know this pattern when I see it” (an abstract neural sense) to “I can describe and manipulate this pattern using a systematic set of rules, equations, or algorithms” (a formalized framework).
In other words, it’s the journey from having a vague, gut-level feeling that something is true or meaningful to actually translating that feeling into a precise representation that can be analyzed, tested, and reliably applied." LOL good phrase!
I develop programs using llm's. It is not as bad as you suggest. In order to be useful at higher complexity levels:" design patterns, architecture, interrelation of classes, program flow, ..." you have to use certain technique and relay on .sh scripts. All in all thinking about it, it is a darn outlandish experience communicating with an Alien Intelligence.
Yeah, the things people criticize about LLMs are shocking to me. The technology is fucking extraordinary, and nobody is forced to use it anyway if they don't find it useful.
It is not at all reassuring to know that your idea of error detection and correction is shell scripts.
2:47 ADVENTURE TIME MENTIONED. Literally in my top 3 favourite shows of all time.
GODS I freaking love it! I dropped off about half way through the show, because I was watching it too fast, and Simon's backstory made me feel feelings. I should finish it
@NoBoilerplate oh you definitely should. It gets so much better from there
@@NoBoilerplate please watch it I've been watched a bunch of children's animated stuff I never watched when I was a kid and adventure time is by far the best. The finale made me want to bury myself in a hole and let nature reclaim me (positive)
@@NoBoilerplate Definitely worth it! One of my favorites. I haven't seen the last season because it's too good to die.
While ranting to my friends years ago, I brought up that dragon aswell! Surreal to see someone make the same connection.
Try to get AI to draw elementary shapes for art students-spheres, cubes, cylinders, cones, and pyramids-without any additional elements. It cannot, no matter the model!
ChatGPT explanation: While AI models like ChatGPT cannot directly "draw" shapes, AI-powered tools (e.g., DALL·E or MidJourney) can generate simple shapes. However, ensuring they are elementary and without additional elements can be challenging due to AI's interpretation of prompts, often adding artistic flair or extra details.
1:45 I've also seen that in a talk from Dylan Beattie. He said something like "Technology is what we call stuff that doesn't work. Once it works, it's no longer technology, it's just stuff." Although I think the quote originally came from Douglas Adams.
Rings very true. I love Douglas Adams, his books are a primary source for my writing for Lost Terminal, I wonder what you think of it? ruclips.net/video/p3bDE9kszMc/видео.html
@@NoBoilerplate I watched season 1 probably like a year ago. Thought it was neat. Kind of reminded me of Wolf 359, even thought it's pretty different. But I never got around to watching the rest of it. I'll see if I can get back to it eventually.
Something I find funny is that with the complete deterioration of search engines you can kind of trick LLMs into being a search engine
As a software developer working on pretty low level and template heavy code, AI (I've only used GPT-4o) is very good for a few quite specific use cases:
1. Cleaning up and summarising verbose compiler error logs which can be hundreds of lines for a single issue.
2. Generating very specific functions and/or metafunctions with a clearly defined set of inputs and outputs and how it should transform them.
Trying to work with multiple successive prompts never really works, it's better to pessimistically assume the AI will not keep any knowledge of past prompts and just be happy when it remembers something useful from the session.
this.
When I hear about a new flashy thing that’s supposed to materialise some day soon I’m thinking to myself: ok, sounds great. I believe it when I see it 😊
Jokes on you. I WANT the Madness!
I'm all for a bit of madness! Case in point, my audiodrama show, Modem Prometheus, have you heard it? ruclips.net/video/7pcbRQ4L4vc/видео.html
Speaking of which, did openAI ever release (or did anybody recreate) that horny chatbot mentioned in that one Rational Animations video? Asking for a friend of course.
It's pretty good Ska Pop for sure.
I mean, you install a local LLM executor (e.g. ollama), download an "abliterated" model (safety-rails removed), and set the "temperature" of the model to something nice and spicy (like Pi^e, 22.4592) rather than some boring low number between 0.0 and 0.6
Have fun with your raunchy, scatterbrained, unhinged chatbot.
@@TheDoomerBlox i don't need an LLM for that, i can just talk to myself :D
“Machine learning” means we understand how it works. “AI” means we don’t.
You are a smart fellow. I appreciate this take and very much agree with it. I have been criticizing GPT since it was rolled out, and if you say anything out loud, a bunch of idiots come for your comment. I didn't know how these things worked at first, but then I- oh, I don't know? Actually studied what they are and how they work? 🙄I came to the conclusion that this technology alone is very unlikely to ever become Data from Star Trek, which is what they want you to think it is. It's actually more like a (somewhat altruistic) superficial sociopath with no need to breathe, and thus can deliver more lies-per-minute than a biological system cursed with the need for oxygen.
Claude is frequently a better model in this department for its more "responsible" tone, whereas GPT will lie, and lie about its lies, using all kinds of weasel words and passive voice to avoid responsibility and obfuscate the issue. I am a solo games designer and have had to use these tools to help teach myself C# this year, but I find it often much better to simply buy and read books. While I generally agree with the video that these LLMs are great for broad or superficial, widely accessible information, what I actually find more common from GPT is to deliver a mostly accurate essay on some aspects of coding, and yet then stealthily bury in the middle of that essay something that's wholly and fundamentally untrue, even dangerously wrong. It makes you suspicious that perhaps these things aren't even mostly altruistic..
I was only able to start spotting these inadequacies once I reached intermediate proficiency with C#. I can now recognize how they use common code examples and patterns I've seen online (for both C# and Unity) and shoehorn them into all kinds of inappropriate use cases, as my queries get more and more specific. These tools are extremely dangerous to a junior student trying to learn to program. I've said that over and over and the comment vultures always say "Well you can't *only* use GPT," as if it's some kind of catch-all, obvious answer. It is not. 1) Look around and you will see how many people are almost only using GPT. 2) Even if one is not, you *cannot* fact check and correct its lies when you "don't know what you don't know."
I think humanizing it in any way is misleading. It's not benevolent, it's not non-benevolent, it has no motivation. Not neutral motivation - no motivation. It's content without any creator, a regurgitated content. Like an algorithm that takes text amd randomizes words in it - is it good? Is it malevolent? Is it telling the truth? Is it lying? No, and approaching it this way makes no sense. The questions themselves make no sense
@@NJ-wb1cz It does one thing and one thing only, it looks at a text output and guesses at what word comes next in sequence.
That's it, it is really good at that one thing but it is not AGI, as AGI requires either A, a fully simulated human brain or B, creating something capable of self awareness which right now is as of yet not possible, the agents we have today are still specialized to the specific tasks they're given while an AGI should be capable, like a human child, to figure out any new task presented to them and be able to perform it relatively well without losing efficacy in other tasks.
@@gavros9636 we don't know what self awareness is, and don't know if a model of a brain would have it. All of this AGI talk is highly made up and speculative
When it comes to the current models and the way they work, all they need is to completely change the training and the training algorithms, and essentially raise millions robotic babies how they would raise a human one, for the same amount of years, with the same effort to provide thwm the same social experiences in different conditions etc. Have them process at least the same inputs human brain processes, preferrably better ones. All the smells touches sounds visuals, etc. I don't think that's in any way feasible in the foreseeable future
And the theoretical discussion about "self awareness" and "AGI" don't matter. These are abstract fantasies, not actually anything tangible.
The best way to think of it is as an auto-complete, like when you start typing a search into google and it suggests a list of searches you might want. LLM's are just a larger version of that auto-complete feature. When is that google auto-complete feature useful? When you're looking for something popular. When is it not useful? When you're looking for something unpopular.
3:17 isn't the solution to this limitation RAG + agents + tool use? Perplexity search engine seems to prove this. Its why shipping larger context windows was a major focus of llm improvement for the last few years.
GPT is best suited at the human/computer interface, and we've seen really good uses here. BUT the claims suggest that anything is possible, which is simply not true
Language over knowledge is such a great way to describe these models’ capabilities. I’m a mechanical engineer and I use GPTs all the time to point me in a direction when I need to solve a problem. Some of my coworkers on the other hand use it to solve a problem for them and the results of that have not been great.
Well done!
The main problem I found with AI is that I'd ask a question, and it would give me a generic answer that was mostly correct, but then I'd ask for specificity and get none. In particular, I was asking nerdy D&D dice statistics questions. The explanations always looked good, but the math was always always always wrong. Even after I told it the right answer, it will still respond with a wrong answer.
"It's always easier to promise a bright future than build a better present." Wow. Politics in nuttshell.
Mid tech but it's only been 2 years since chat gpt came out. Compare progress of AI from 1950-2015 and 2015-2024. The rate of progress is insane. It's easy to just look at day to day changes and it feels so slow. The internet took about 20 years to reach mass adoption!
I think AI is truly different this time. There's so much research and money going in that I think next year and the year after is going to be pretty intense. This isn't something hard to make use of like crypto.
Some things actively in development
- Dynamic computation models (think longer and harder on more difficult questions)
- Reasoning based model (Don't teach the model the answer to the equation, you teach the model how to solve it)
- Reinforcement learning (Instead of teaching the model the correct answer, teach the model only if what they did is wrong or right. This technique was used to create AlphaGo which beat the human world champion)
- Test time compute (Ask the model to try answering it multiple times in parallel and choose the best answer)
- Incremental learning (Train the model during inference. This is what your brain does. Anything that you memorize is stored within your biological neural nets)
There's probably more techniques that I don't even know or none of us even thought of yet.
To the claim "it's just a fancy auto-complete". Imagine this: You read a detective novel up to the point where the detective says "The criminal is...". You have to figure out the criminal without seeing the answer. In order to do that you have to understand the story, make couple of theories and finalize on an answer. The name of the criminal is based on all the text before that. Now imagine if some one or something can do this consistency. Wouldn't this be true intelligence? I think just because a model is trained on the next best word, that doesn't mean it's a dump auto correct. It's a asymmetric relationship. A dumb auto correct is algorithm to predict next word. But not all entities that can predict the next word is a dumb auto correct. Take a human for an example. We can in fact be a pretty reliable auto correct "AI".
Also the base model is trained on auto-complete but a model goes through different iterations of fine tuning. This is where we get "assistant like behavior". If we didn't fine tune the model it'll literally just be an auto complete. This is why llms asks you to clarify your question if you send "How do you ge" instead of trying to complete the question that you prematurely sent
Mark my words, 2025 and 2026 will look insane and it will be one of the fastest growing technology (even when compared with smart phone or the internet)
Consider the example of fully self-driving cars. Remember the promises made circa 2017-18 and the appeals to "exponential growth" and "look how far we've come in the last 4 years alone" type claims? Where are they now? We've got a few slow moving robo-taxis that work in a select few neighborhoods at best. Actual technological progress in these types of "nascent" fields happens in short bursts of growth (which is way faster than exponential), followed by plateaus. It's impossible to predict when/where these plateaus will be hit.
All the things you mentioned are promising avenues, but it is not really convincing to say "think of how much more new SotA tech we might get from all these research directions". LLMs and transformers were themselves one among many ideas, most of which went nowhere (at least in comparison to LLMs).
And the appeal to more compute has similar problems. Since we don't have a theoretical foundation for what results in intelligence, we have to just hope more compute or more data or some new architecture solves it. In something like theoretical physics, you could reasonably expect to model a new hypothesized particle and predict what range of energies you need in your particle collider and decide if you can build a big enough collider. If it turns out to that you need a 500km collider and the biggest we have now is 27km long, you could just say "we dont have the technology to detect this yet" and look for other avenues. But with AI you just have to hope and pray your new approach ends up being worthwhile.
Demis Hassabis has recently said that its an "engineering science" in that you first have to build something worthy of study and only then can you study it. There's inherently more uncertainty in that process compared to "normal science" where hypotheses and predictions can be made with a greater degree of certainty.
@@psd993 Interesting you bring up self driving cars since that's the main AI progress I've been following closely
I'll just say FSD v13.
You'd be surprised next year when it starts entering main stream market ;)
ruclips.net/video/iYlQjINzO_o/видео.html
@@Aosome23 I agree with most of your comment but I'm skeptical that 2025 or 2026 will be "the year"
It's always N+1
@@NoBoilerplate we never thought we'd get on the moon until we did. Rome wasn't built in a day, and Skynet will be no different.
I rarely write comments but I felt this was worth addressing.
The points made in this video are valid to an extent. LLM's on their own are generally useless, hallucinating auto-complete machines. Its true that profit motivated project managers are feining to present anything AI branded to their stakeholders. We are not where we need to be now for AGI.
Despite all these facts, the technology is advancing about as quickly as we would expect. By introducing chain-of-thought and multi-modality to GPT we have effectively expanded beyond the limitations and are rapidly approaching the horizon of AGI. If you compare GPT 3.5 against GPT o1 its a night and day comparison in all factors.
This video judges AI on the notion that it is fully fleshed out and that's just not true. We are still far from having a Jarvis-type assistant for everyone, but its not as far as people are making it out to be.
Probably the greatest/most overlooked thing LLMs provided me with is making voice input actually useful.
I can just turn on 'voice keyboard', for example loudly list items I'm going through and with simple prompt LLM will make a good, markdown list from my word salad
This
Do you have a recommended workflow for this? I struggle with "too many ideas, not enough WPM" from time to time and this sounds useful!
Exactly. It's frustrating to hear the constant criticism about the limitations of LLMs' knowledge while the "language understanding" aspect of them is what really makes them shine. They're fantastic for summarizing/parsing arbitrary information.
It's been fun to use it as a language learning tool. As you say, it understands language really well, with increasing quality the more common the language is. I should really try it on some obscure language sometime, maybe a conlang. Man, imagine talking to an LLM in Sindarin.
ha, amazing! Maybe something big like lojban might be in there, maybe even toki pona? But the less information there is, the worse it is 🙃
right as i'm getting fatigue from prompt engineering and taking on an AI related role at work, this pops up
i'm half tempted to pivot int working on VR instead
GPT is great, but only for language tasks!
@@NoBoilerplate my company doesn't want to make it do language tasks tho, and prompt engineering is it's own special hell, but we shall see what the future holds i suppose
VR is still bound by the hardware.
I've been there about a decade ago.
Hardware has not advanced nearly as fast as I thought it would have. It's progressing at maybe half the speed I had expected back in 2014-2015.
I've been working with VR for years. Go for it if you can, it's a very fun space to work in.
4:14 “…and we mistake language proficiency for intelligence…”
_And_ reasoning and understanding and some inner psychological goings-on and who knows what else? It all amounts to a giant attribution error. (It’s easy to see why-in the millions of years of human evolution, the only beings who could respond meaningfully to us via language were _us,_ and, therefore, these large language models “must be” like us in all sorts of other ways, too.)
On another channel I watch involving AI there are always these references to “reasoning” and whatever and I say these models are _emulating_ the verbal behavior that we associate with reasoning-it’s _not_ reasoning. That’s it.
I _will_ say, as you do 2:11, that these language models are really good, perhaps stunningly good, at language tasks-acting as a thesaurus, translating, cleaning up awkward or ungrammatical text. They are, after all, _language_ models. But they’re _not_ intelligent. They’re like the savants of the computer world-highly proficient, even exceptional, at language, and surprisingly deficient at everything else.
I haven’t seen a video online express these ideas so clearly until this one. It’s really excellent.
Thank you so much! I have a close relative who has trouble speaking fluently due to a medical issue, and it's shocking to see how immediately people think he's stupid. We're SO hard-coded to guess deep insights by language proficiency!
0:28 TARS and CASE 😔
@@kwekker ... copied from 2001', HAL
@@ethzero oh I just mentioned the cool AI/robot characters that first came up in my mind when he mentioned this
DONT LET ME LEAVE MURPH
@@ethzero so? They're still very funny and cool ai's
You’re totally right, but what concerns me isn’t the capability limits of LLMs, but what they might achieve with clever interfacing. Like the “theorem solver” mode mentioned in the errata. Do you think, if the accuracy is high enough, this paradigm of “autocorrect on steroids” can actually do fancy sci-fi stuff by combining a giant number of different models with clever interfacing?
for sure. The genius place for LLMs (as apple are doing, to be fair) is in the interface between human and machine.
I like prompts like "What are advantages of the A* algorithm - give me sources". With this i almost always get summaries that are easy to understand of papers or articles with links to papers that describe the problem in more detail.
I keep hearing the same complaints about AI.
Personally I use it to generate boilerplate and then go through line by line and analyze what everything's doing.
It's infinitely quicker than typing out 300 lines of code by hand.
Are you really have to do is go through and make sure all the logic lines up with what you want.
Maybe for someone like primeogen who's been coding for 30 years it's quicker just to type it by hand.
But for a junior engineer it's much quicker and easier to use AI and then patch up whatever little mistakes it makes.
Also I'm wondering are people trying to use AI for a whole project?
In my experience in only works well at one objective at a time.
Senior developer here. Thank you for helping provide me with a job for the next several decades. I'm the experienced person they bring in to clean up critical bugs deep in the software, that require deep knowledge of the languages, libraries, and runtimes involved. Every single time I touch a codebase, there's always some critical bug around data concurrency, memory management, type system semantics, security vulnerabilities, or some other problem involving a lot of crunchy algorithm knowledge. There is a zero percent chance that an AI understands the underlying memory model or algorithm design or language specification well enough to write code that's not going to completely fall apart when it's thrown into the real world. So, thank you for helping keep me employed. I'm not a fan of dealing with large legacy codebases, but it pays well enough, and those codebases will be around forever. As long as contractors are prioritizing deadlines, and as long as developers are using AI and "touching it up", I'll have a job rooting out critical bugs that require a deep knowledge of core computer science topics.
If you don't want to be in the business of helping create jobs for more senior developers, put in the work. It's what Primeagen preaches. He's a good developer, but you don't need to write code for 30 years to be that fast on the keyboard. If you want to vomit out 300 lines of code, just get good with your tools. All it takes is a little bit of time and a ton of discipline. Take a few months and drill your keyboard shortcuts. Move your mouse to the opposite side of your desk, and only reach for it if you spend at least a minute fumbling through your shortcuts. Even better, learn about alt+tab, and google the keyboard shortcut you need before reaching for your mouse. You'll know you're done practicing the first time you feel like it's too much effort to move your hand to your mouse. Do that practice for even just a few weeks, and you'll be twice as fast on the keyboard. Do some typing drills, and work on your muscle memory. A few hundred lines of code isn't anything difficult. It's surprising and pathetic how often developers can't write code. "Why Can't Programmers.. Program?" is getting close to 20 years old at this point. Don't be like that. Be better.
After your muscle memory is good enough that your hands magically start converting your ideas into character in your IDE, work on algorithms and design patterns. And just write a ton of code. Practice toy problems, and build some real world applications. There's no experience more valuable than sitting down and trying to create something yourself, without any notes to copy. If a year of programming practice sounds tough, remember that some fields require 7 years of school. You can either spend 30 years not caring and slowly learning through accidental lessons, or you can dedicate a year or two and really master your craft. Go read the top posts on classic blogs like Joel on Software and Jeff Atwood's Coding Horror. Dig incredibly deeply into the crunchy theory of computer science, and binge through videos like "Parsing JSON Really Quickly: Lessons Learned" or "Performance Matters by Emery Berger". Both those talks cover the complexities of modern programming, rather than the overly-idealistic tutorials that teach the average developer. Read all the hundred page technical specifications for your language. Watch the deep dive tutorials on obscure features, and then go build a real application with your newly found knowledge. Dig as deep as you can into the crunchiest subjects you can, and you'll be rewarded by having more knowledge than most of the other developers on your team.
Don't shy away from challenging yourself while you're learning, and you'll be ready to meet any challenge in the real world. The more you practice, the more you learn. The inverse is also true. Primeagen stopped using AI code completion tools because they were stealing the valuable time he used to practice writing code while he was working on real problems.
Yeah, just make sure you are learning your craft. The thing with this is that the act of writing code is what creates the memory, the experience, the knowledge - if you had that over to a tool - you may find that you're actually not as good as you feel you are at programming. Turn off the tool for a bit - make sure you actually know what you are writing. As prime mentioned, when he turned it off, he realised he was constantly waiting for the editor to complete his code, and that he was effectively doing himself dirty by not writing the code himself. That doesn't mean you shouldn't use it - but just make sure you aren't losing opportunities to learn by handing the wheel to the AI.
@@alexlowe2054 You know you can give advice without going on and on about how much better you are than people less experienced than you, right?
At their best, they are a good addition to a search function. Giving summaries and sometimes picking out just the right tidbit, like googling for a stack overflow answer but not actually having to go through the results manually.
But that's about it.
yup!
In my experience, the people who really truly believe in AI and who go out of their way to research it are all trying to answer that last question you posed, is GenAI one of those systems that only requires more time and more computational power to get better and better? It's interesting because we've been sold Moore's Law so hard in the past that I think people are assuming the same thing will happen with AI. Personally, I think GenAI will plateau until the next big advancement of AI models comes out, like how transformers caused this current set of AI breakthroughs. But I do think it's a difficult and ambiguous question with possibly no right answers
fair. I think the geometric increase in complexity the larger the model feels quite self-limiting, so I'm for the plateau theory
One of the surprising things is that it keeps not plateauing. It's easy to be misled about this because a) so many people keep saying it has plateaued and b) there are many very plausible ways in which it should and probably will plateau at some point. It just hasn't yet.
@@andrewdunbar828 It is though......each new model increase has been less and less than that of 3.5 to 4.
Actually one thing I was impressed with lately is using it to design a database schema and basically organize my messy human thoughts and eventually have it write the actual code for the entities and their configuration. Previously this is something I remember took a lot more effort to organize my thoughts and make sure relationships make sense etc
yep, easy stuff is easy, good for boilerplate, but 90% of the work comes in the last 10%
I think it's pretty same story with humans... They also approximate stuff that they learn. Many things in this universe is so complex that you barely can know everything about some subject/topic on all levels of abstraction. And even if we know everything on some level of abstraction (like math because it's defined) we don't know everything on another levels that we apply math to. For example physics.
Take my comment with many grains of salt.
LLMs are a huge step towards making computers more accessible, but language ability is only part of the problem
Discrete math is a trivial example of knowledge that is limited to a null-subset of the entire domain. That is basically what Goedel found out. ChatGPT can't even do math at the high school level reliably, let alone get around Goedel.
Maybe a better a name for it would be Artificial Average Intelligence.
As a frequent GPT user, I can say it's just a massive time saver. Research that could take a week takes minutes. The fact that I can do so in natural language without learning some arcane coding or complex interface. No matter its limitations, learning how to use the tool makes you many times more productive. If nothing else, having such a powerful tool publicly available is worth all the ways people are learning to get value out of it. They may have made unfulfilled promises to investors, but maybe they are fulfilling promises they couldn't have known about. How will the tool affect how research happens? How quickly you can fact check something? How projects are drafted? How problems are solved? I use it for all of these things. It's apparent reasonableness and comprehensiveness makes it quite authoritative. That alone saves time.
I have a feeling even if there were a lot of phd lvl resources , Current LLM's will still struggle.
"More specific Less usefull" - This is subjective ,for ex in generating code the more specific the prompt better the results for me.
I was worried that title was going to be confusing, sorry. It should read (and I hope the context of what I said over it makes clear):
"More specific requests, less useful responses"
The graph you've shown is quite succinct. I work on things that range from medium to high complexity, and when it comes to anything that is more complex I'll have to do it myself. I don't rely on AI at all, I use it as an alternative search engine when a solution is hard to find. And even then I have to be careful because it's stupid quite often.
It's like I said in my "Renaissance" video of a year ago, GPT is like an intern. It'll get you an answer, but you'll have to check it!
I was testing Claude out by asking it to write a function in Haskell, and it did surprisingly well-BUT it suggested using record syntax to change the month in a Data.Time Day. I told it that was clever before finding out it didn't compile, because Day *isn't* a record. it corrected it to something that worked.
later, in the same chat, I asked it to make a change to the function, and it tried to do *the same thing,* ignoring the correction, I guess?
it's interestingly clever, while at the same time being interestingly stupid
@@thoperSought long term memory and parallel thinking are knowingly some of the biggest gaps of current AI. For the memory part the companies itself put limits for chats
@@invven2750
this makes me wonder a couple of things:
the model they let free accounts use changed between the first and the second parts of that chat-would it have made a difference if the model stayed the same, or is it just not taking the whole chat?
"It is always easier to promise a bright future than to build a better present"
An excellent line
I think you explained perfectly why I struggled to use something like copilot. Its like autocorrect trying to correct a word its never heard before- I have to frustratingly delete the change it made only to see it make the exact same mistake a minute later
I work with eccentric fields of programming and copilot would constantly generate nonsensical junk or outright duplicate what I'd already written. It certainly worked better when I was working on my website, but in the end I turned it off after 30 minutes of leaving it on.
all that about ai startups is true
chatgpt is actually pretty decent at math now tho. i often ask it for help with my uni test questions and while it does make mistakes from time to time (like adding a minus or something) it's generally accurate, even with more specific and complex stuff. I think what it does differently is it generates invisible text that helps what is visible be more accurate (for example extra tiny incremental steps for the math questions) and maybe it has an interface to a calculator too (so, making this up, say, if it returns "[=30/6]" the system would replace that with the result before continuing to generate new data.
knowledge wise it's also gotten better. it now looks up what you are asking on google, evaluates the reliability of the results and takes input from them. it's quite impressive
"maybe it has an interface to a calculator too" ... if it's giving correct answers for math, that's without a doubt what they've done: recognize math and hand it off to something that can do math, something that's _not_ an LLM.
Another banger! I really see LLMs as letting us freely move and navigate through semantic space, it's GREAT at transforming and shaping language you can give it, and its training usually makes it "good enough" to smooth over the patches it could be confused about. I use them nearly every day for learning and research. The main idea is not that the model just "tells me what I need to learn", but that the model can combine ideas, turn them over, split them apart, recombine them, and look at them through many different semantic lenses way quicker than I could do alone.
Chat gpt is really good when you forget what something is called or want help with syntax in a programming language you aren't familiar with, but if you ask it to write an entire script there's a good chance it will completely fail.
As a junior software developer, I have gone through a few phases with AI: from sceptical, to using it sometimes, to now using an editor with built-in AI. I definetely work faster using it, but that is because most things I ask I could've written myself or can at least understand. This ability to filter correct and incorrect assumptions is crucial, sometimes it gets it correct first time, sometimes it needs a few new prompts, sometimes I need to tweak it a bit afterwards and a fair few times it is plain useless. I must say I like working with AI now, precisely because I know when and how to use it and because I now have a feeling for its limitations.
Just be careful with that as a junior developer and make sure you are actually learning things. Tools like co-pilot have a tendency to take away your need to think. Try turning it off for a day and see if you can still code. If AI does start replacing software engineer jobs, you don't want to be the SE that is only as good as co-pilot.
I have a rule that if LLM is not getting the right answer first time, either the question is not correct, or there is no enough data to give a better answer. So I need to either change my prompt or break it down to more manageable chunks.
@@ivanjermakov yeah... except sometimes there just isn't an answer. What you are missing is, say in a code question - the mistakes are not necessarily in the response - but in the code it generated to give the response. That's not part of the question, and asking the question differently should not give a more correct answer. Now - you can ask it to re-evaluate the answer, and sometimes it will improve - but if it does not have training data for the answer - it will just hallucinate. No way around that.
@@zoeherriot can't wait for LLMs to have confidence factor so that they can say "I don't know".
@ this will be hugely useful. :)
AI is great at answering all the questions you already know the answer too or could trivially look up the answer.
you're gonna love this en.wikipedia.org/wiki/Michael_Crichton#Gell-Mann_amnesia_effect
thank you as always for being consistent, concise, and clear.
My pleasure! And thank you :-)
1:55 Kind of, but historically we have been quite restrictive with the use of the term AI, until of course the recent AI boom.
Most of these were referred to with a more accurate description, predictive text, machine learning, ect, even while they were being developed. Using AI to describe these systems is a recent and inaccurate phenomena, largely retroactively applied after the hype and marketing about their spiritual and technical successors in the recent "AI" boom.
Deckard was never confirmed to be a replicant.
Someone's not seen the director's cut ;-)
@@NoBoilerplate Where is it confirmed in the Director's Cut? The Director's Cut introduced leaves more room for interpretation as to whether Deckard is a Replicant than the theatrical version. And the final cut makes it even more explicit, but still never really confirms it.
Deckard not being a replicant is boring, Deckard being a reicant isn't boring. Err on the side of the not boring. And it's pretty obviously the authorial intent.
@@deadlightdotnet I disagree that Deckard has to be a replicant AND that it's boring if he's not. To me, the point of the Director's cut is that he _may as well_ be a replicant, and that's the point I find most interesting.
@adamwells9352 if you take all of the available evidence then he's clearly a replicant. It's not debatable.
I'm of two minds here.
On one hand, I agree with almost everything you say. I'm sure the stakeholder hype train you describe is real, and that many big promises are made that can't be delivered on. I have seen how the training data sparsity is a major factor in accuracy in certain niches. It is well established that LLMs are effectively the world's most expensive, fancy "predict the next word" machines, which does not a priori make them great "understand and reason through this problem" machines or even mediocre "do what's in the best interest of my company/consumer/citizens" machines.
However, I'd argue that this is not completely sound as reasoning for why AI is unlikely to grow and solve more complex problems than the world's most advanced parrot. There's something in AI safety research called the "orthogonality thesis" which claims an agent can in principle have arbitrarily high intelligence in the pursuit of any goal. That is, a tool for any purpose can be very very smart. Even if some future version of GPT's only goal is to predict the next word in the sentence, if the model is smart enough, it may in fact be able to do advanced, layered reasoning under the hood to bring you the next word. For example in math, most complicated problems can be broken down into a sequence of simple problems. If gpt knows the simple answers from training data, and has been trained on a sufficient volume of mathematical analyses to be able to mimic stitching these simple answers together effectively, it may be able to construct the complicated answers. If that's the case, then the collection A of answers it can give is not limited to the collection D of data it has been trained on, but on some construction R(D) of answers it can construct by combining answers from D. In principle, R(D) can be much, much larger than D, and may entirely bypass your state space sparsity argument. To what extent my reasoning holds in practice is not a trivial question to answer, but I would argue we see inklings that it already does hold true at a small scale. I have seen LLMs struggle with some questions when asked outright, but then perform much better when prompted with a chain of thought approach.
i think the real bad case against current AI is not even the fact certain knowledge is obscure, at some point it might just get developed enough to say "i don't know that"
the issue is that Language is not everything, there are LOTS of things that are know. but are poorly expressed thru language.
a big example, posing. we have no language for posing. some poses get names, like the super hero landing or the Marilyn Monroe.
but a photographer work is literally a set commands "raise you arm", "turn around", "look up" is just a bunch of very vague instructions.
and the more you try to be specific, the more you end with deep fried outputs.
道可道,非常道。
名可名,非常名。
Shows how much structure and information our language in itself contains.
HUGE amounts! But you know what else contains structure and information? Structure and information 😉
LLMs are fantastic at inter-language communication (with the caveat of "it has to have the input/output languages in its repertoire") and info-gathering, a nice presentation about that can be found under this name:
"Beyond the Hype: A Realistic Look at Large Language Models • Jodie Burchell • GOTO 2024"
As the name "Large Language Model" may imply, it is a tool for crunching through large quantities of language data.
They are good for crunching patterns within the language data, usually good at presenting the results, and as long as the written languages are within their repertoire - usually language-agnostic about where those patterns came from.
I can see the comments also point out the finer points of "info-gathering", i.e. good at turning vague descriptions into a more explicit What To Search for, as a way to find data sources on a specific subject, likely many more fun use-cases.
But yes, the important thing is What Are The Tools good for, not the weird presented pie-in-the-sky scenarios.
These things definitely have their uses, and it's not just being a fancy chatbot.
absolutely!
"Pensive blues"... "Hopes echo"
You're right, that's golden thesaurus-ness
Great point which I'm glad you are publicizing. A corollary, 'AI' is pretty useful at basic tasks right now (like writing simple code, tedious config boilerplate, finding information, proofreading, etc.), but the excitement largely is not around improving the interface to make those affinities more useful to the end-user - it is instead pursuing so called 'AGI' which is a fundamental breakthrough and by all means should be more exciting but far more difficult (not to mention the extremely nebulous definition which plays right into your point ['AI' is the ultimate investor bait]).
There are some examples: Blender and Adobe integrating 'AI' into the workflow. Often when using programs like that I am asking an LLM for micro-tutorials along the way - skipping that step and just letting it do that task on the file directly is fantastic, and will be the genuine way to create near-term value for users and companies.
They really don't need large amount of text covering a topic to learn. Example: Claude knows a lot about myself, just from training on Reddit comments. It could answer how did I interact with specific other user - when I had just a few interactions with them.
Predicting arbitrary text is not just mastery of language. It's not a narrow task, at all. Our brain works by predicting our inputs as well.
From Gwern's "The Scaling Hypothesis" (...I suck at pruning stuff to excerpt, so it's kinda long):
> Humans, one might say, are the cyanobacteria of AI: we constantly emit large amounts of structured data, which implicitly rely on logic, causality, object permanence, history-all of that good stuff. All of that is implicit and encoded into our writings and videos and ‘data exhaust’. A model learning to predict must learn to understand all of that to get the best performance; as it predicts the easy things which are mere statistical pattern-matching, what’s left are the hard things.
(...)
> once a model has learned a good English vocabulary and correct formatting/spelling, what’s next? There’s not much juice left in predicting within-words. The next thing is picking up associations among words. What words tend to come first? What words ‘cluster’ and are often used nearby each other? Nautical terms tend to get used a lot with each other in sea stories, and likewise Bible passages, or American history Wikipedia article, and so on.
> Now training is hard. Even subtler aspects of language must be modeled, such as keeping pronouns consistent. This is hard in part because the model’s errors are becoming rare, and because the relevant pieces of text are increasingly distant and ‘long-range’. As it makes progress, the absolute size of errors shrinks dramatically.
> (...) as training continues, these problems and more, like imitating genres, get solved, and eventually at a loss of 1-2, we will finally get samples that sound human-at least, for a few sentences. These final samples may convince us briefly, but, aside from issues like repetition loops, even with good samples, the errors accumulate: a sample will state that someone is “alive” and then 10 sentences later, use the word “dead”, or it will digress into an irrelevant argument instead of the expected next argument, or someone will do something physically improbable, or it may just continue for a while without seeming to get anywhere.
> All of these errors are far less than 0.4?
> Well-everything! Everything that the model misses. While just babbling random words was good enough at the beginning, **at the end, it needs to be able to reason our way through the most difficult textual scenarios requiring causality or commonsense reasoning.** Every error where the model predicts that ice cream put in a freezer will “melt” rather than “freeze”, every case where the model can’t keep straight whether a person is alive or dead, every time that the model chooses a word that doesn’t help build somehow towards the ultimate conclusion of an ‘essay’, **every time that it lacks the theory of mind to compress novel scenes describing the Machiavellian scheming of a dozen individuals at dinner jockeying for power as they talk, every use of logic or abstraction or instructions** or Q&A where the model is befuddled and needs more bits to cover up for its mistake where a human would think, understand, and predict.
> For a language model, **the truth is that which keeps on predicting well-because truth is one and error many. Each of these cognitive breakthroughs allows ever so slightly better prediction of a few relevant texts; nothing less than true understanding will suffice for ideal prediction.**
> **If we trained a model which reached that loss of The last bits are deepest. The implication here is that the final few bits are the most valuable bits, which require the most of what we think of as intelligence. A helpful analogy here might be our actions: for the most part, all humans execute actions equally well. We all pick up a tea mug without dropping, and can lift our legs to walk down thousands of steps without falling even once. For everyday actions (the sort which make up most of a corpus), anybody, of any intelligence, can get enough practice & feedback to do them quite well. Meanwhile for rare problems, there may be too few instances to do any better than memorize the answer.
> **Where individuals differ is when they start running into the long tail of novel choices, rare choices, choices that take seconds but unfold over a lifetime, choices where we will never get any feedback** (like after our death). One only has to make a single bad decision, out of a lifetime of millions of discrete decisions, to wind up in jail or dead. **A small absolute average improvement in decision quality, if it is in those decisions, may be far more important than its quantity indicates, and give us some intutition for why those last bits are the hardest/deepest.** (Why do humans have such large brains, when animals like chimpanzees do so many ordinary activities seemingly as well with a fraction of the expense? Why is language worthwhile? Perhaps because of considerations like these. We may be at our most human while filling out the paperwork for life insurance.)
I didn't understand the ending, but it seems Pretty good at the middle. Nice text!
This text seems to confirm the poster claim. LLMs learn to master the language. They struggle to learn the semantics of the words they use. In order to mock that, they need human feedback and learn what combination of words is nonsense and what not, but they cannot distinguish the feasibility of situations not already encountered during their training.
The reason why AI took off so abruptly is never-before seen of language "understanding". It's a shift in paradigm - conversations like that was truly unimaginable and science fiction 5 years ago. It mimicks humans surprisingly well. No matter how much nonsense I write, ChatGPT ALWAYS know how to respond and frame it.
But this shocking shift in paradigm doesn't translate that much to actual usefulness. There is an alternative universe somewhere in which Cleverbot is based on GPT and used by kids for fun at sleepovers.
Good video, thanks for your effort.
My thoughts:
1. Don’t confuse marketing for practicality, Steve Jobs did this with his vision for Apple, which came to fruition (lol).
2. Language models only work as well as the user can articulate themselves and define prompts with specificity (you have to know exactly what you need from it and be able to communicate the necessary context).
3. A.I will not think for you, yet. This is why it seems underwhelming to people, because they are seeking it to fill the void of “god”, an entity that will do life for them.
4. This is the “iPhone 1” of AI, expecting holographic waifus that validate your unresolved issues is asking too much… for now. it can simulate what that is, but it’s our nature as humans to want it to actually be it.
5. Marketing, full circle, early adopters understand the utility and place AI currently has, and the seemingly underwhelming version is being pushed to the masses to get them used to it. If you dropped deus ex Machina on the world right now, people would lose there shit (e.g. giving an AI all of your language, visual, biological and information data and it telling you to ask the qt out because your peepee fluctuated, I mean, your heart rate increased…)
So…
1. AI improves technological and scientific advancement.
2. AI is fed the new data.
3. AI is now nursing your newborn and breastfeeding *you*
All hail the omnissiah
AI models have already gobbled all data the humanity has generated throughout the entirety of our history. There is no data left to gobble up, and the new data we generate each day contains more and more AI slop, so it's increasingly lower quality and more tainted.
So we already see AI models quickly plateauing and there's no firm reason to think they will return to exponential growth.
And our chips are plateauing as well, so on the hardware lev we won't see the kind of rapid exponential progress we saw in the 90s as well
Next jumps depend on some unknown discoveries happening, and for now this is just pure belief, extrapolating recent rapid progress into the future just because it feels like it should be that way
"Hey chatGPT, I need a C++ library that does triangulation without earcutting" (I wanted something that would do delaunay triangulation or similar, like poly2tri but more robust/stable, due to some constraints I needed with vertex ordering which the earcutting technique does not enforce)
>chatGPT proceeds to recommend me mapbox's earcut library
bruh
it's RIGHT THERE IN THE NAME.
I disagree, and I think it's reductive to say that they're just fancy autocomplete.
Just yesterday I was having a coding problem I'd been struggling with for a couple of days. I had some code, but it was taking about 10 minutes to run. I didn't know how to make it faster.
I pasted my code into o1 and asked it how I could make it faster (it was essentially how to optimise a deformation field to make one mask align to have the same shape as another).
It thought for 9 seconds, and identified the exact issue causing the slow execution speed, and re-wrote the entire class to use a differentiable cost function, calculated the partial derivatives, and I could just paste it into my program and it immediately got 100x faster and worked perfectly with no modifications. Previous models including o1 preview couldn't do that.
Not to mention you can give it those PhD papers that have been written to give context to what you're asking it about your specific topic.
AI is incredibly useful, and I've used it to save hours and hours of work.
All you are telling us here is that AI can remember textbook algorithms better than you can. That is true, but if the answer is not in a textbook, then you are on your own. Even worse, what if virtually all textbooks contain the wrong answer? There are examples of that. Quantum mechanics, for instance, suffers from a teaching crisis because all textbook authors are using a "shut up and calculate" strategy for their undergrad textbooks. I asked several AIs to explain QM to me. All I evet got was shut up and calculate, no matter how false it was within the scope of my precise questions after the deeper physical reasons for WHY quantum mechanics is structured the way it is. That is simply not in the textbooks. It is in a few papers that almost nobody has ever read, but the AI can't know that these few examples of papers are far more important than a hundred textbooks. And then there is one specific question that is trivial (any undergrad can answer it if you point it out to them WHY it is an important question) but the answer is not contained in any book or paper I have ever seen.
Tremendously useful, with very foundational limits. Keep using these tools and find the edges of their ability, then remember that you're only seeing this limit because you are a domain expert. en.wikipedia.org/wiki/Michael_Crichton#Gell-Mann_amnesia_effect
It still should be called advanced pattern matching. Not AI
In Claude's defense at 2:33, that wasn't even a real question. If I had designed an AI that had been "asked" that, it's response would be to ask the user why he doesn't go learn how the f*** to talk.
@@matthewexline6589 false. How is it not a real question. Do better
@@andrewferguson6901 It has some poor grammar for a question. The first "the" should be deleted and it should end in a question mark instead of a period. But a smart autocorrect could probably get the gist anyway.
Agreed Completely.
Never buy promises. But facts.
Value what exists today, not what will (maybe) exist tomorow(never).
LLMs are like librarians that have read every book in the library. Excellent for summarizing broad swaths of information but you shouldn't trust the librarian to build your Linux system when they're telling you to run `chmod 777 /`
Ironically, having to fix and overcome the problems caused by blindly listening to the LLM has actually made me much more proficient as a user so LLMs are a great way to blindly charge into something that one would otherwise be anxious to start
The irony of getting the ad "This AI toll will create a stunning E-book" 🤣🤣
Yesterday, Copilot couldn't tell me how many total votes Trump got in the recent US election. Fair enough if it doesn't know yet, but It told me it was 'a nuanced question'.
The "nuanced" part is probably artificial guardrail detritus introduced by humans. But they're not usually great at numbers and didn't used to be able to browse the Internet for info not in their training data. Now many of them can but maybe not all of them.
Grok is far superior when it comes to questions about live or recent events. I use it mostly for keyboard fights lol
It's interesting, ChatGPT can list a whole bunch of contributions to society from African countries, but European contributions are nuanced and not important to think about.
@@snorman1911lol. ChatGPT is designed to tell people what they want to hear. The guard rails are added to sometimes not tell people what they want to hear. The human gets upset. But the human asked because it was looking for something to be upset about. Mission accomplished?
Copilot is the worst… Microsoft can’t keep themselves from constantly nannying users and substituting their judgment for your own.
Thank you for this! I needed a sanity check amidst Sora's release.
The promise is what the whole stock market is build on, which is why we get bubbles.
If only there was a solution! ;-)
@@NoBoilerplate well, the solution people seem to have come up with: in this era of global finance, they buy houses and prices went up everywhere, including for example Africa. Clearly not better.
oh, I was alluding to socialism, democracy instead of money. Quite the challenge, of course, but maybe given the news this week things are changing...
@@NoBoilerplate I mean, I'm talking about what people are doing, you are talking about what people maybe should be doing.
That graph makes me think of what Rick said about using the Meeseeks box. "Keep it simple they're not (burp) gods."
Not gonna lie, as someone about to graduate with a CS degree, the progress that AI is making with programming is destroying my hope for the future and mental health. I really hope this plateaus, or my last decade of passion for programming will be worthless
For what its worth, AI is a tool, that tool will have to be used by people and the better you know the workings of that tool (AI is just a computer program, complex, but still programming) the better the worker can be. Don't give up on CS, if anything dive deeper into it. Everyone will begin by implementing half baked AI solutions that junior developers have knocked together and then without doubt I think we will see a resurgence for the need for people who can actually develop because the solutions will be such a mess that even the AI trips itself up. The future will need people with deep CS knowledge. But for argument sake lets say AI gets so good it eliminates the need for every software developer, well I'd say that would mean its just eliminated the need for most jobs at that point, every manager, accountant, ceo, cfo etc etc. so the world will have to change not just your degree.
It's *so* bad at anything other than boilerplate code, don't even worry. The rule I explained in this video perfectly explains it: There's loads of basic code for it to learn from, and almost no advanced code. Use copilot to speed you up for sure, but don't lose any sleep over it :-)
Really nice take - it doesn't take long to leapfrog what AI can do. And what luck, you aren't replaced by copilot, there still need to be junior coders, right! Check my "Renaissance" video for more of my thoughts here
Samesies. I keep telling myself "well it's not like computers are going to be less important". I try to spend most of my independent research effort on understanding / keeping up with AI theory and use, to try to stay ahead of the curve. If/When things get worse, maybe at least I can run the Wendy's because I can work with the Wendy's AI better than all the psych majors.
Ai generation of code is improving fast. I still find them best for buddy coding. At least for now, you need someone who can code to get them to collaborate in generating high quality code.
"Like a sociopath getting under your skin by saying what it thinks you want to hear"
Not "like". LLMs like ChatGPT are trained to produce responses that are more preferable to human raters, so they are in fact LITERALLY doing their best to say what they think you want to hear. Sometimes this happens to align with telling you the truth (or its best guess), but if you are looking to be deluded it will happily oblige. It is surprisingly difficult to get these chatbots to respond with a hard disagreement, barring censored/sensitive topics where they've had prescribed answers beaten into them.
No Boilerplate 🤝 Brennan Lee Mulligan
Capitalism is the bad guy
(Taking a slight bet that you get this reference Tris, but I think I have good odds lol)
oh, no, you've got it exactly. Brennan is my inner voice
@NoBoilerplate lol, definitely relate to that
Unrelated to this comment and semi-related to the video, have you read/listened to Hannah Fry's book about ML/AI "Hello World"? It's a very very good history on machine learning, talking about how it works, where it goes right/wrong, how humans and AI work well together, etc. It predates the rise of ChatGPT, but I still think its philosophical stance is something I agree with a lot, especially these days. (If you're very much in the world of AI, eg you do it for a job or research, the actual facts won't be that new to you though, it's not a deep dive since it's meant to be approachable to a general audience). Maybe worth checking out if you haven't, I highly recommend it!
5:47 It was realizing this that made me understand why the art industry feels so threatened by generative AI.
It's not that generative AI can do a good job embodying visual artistic expression, especially good enough to genuinely replace artists - because LLMs are the underlying technology, they can only learn techniques that they have a lot of training data on. But the _investors_ who decide what projects get money and where that money goes are making inappropriate decisions based on the promises they're being fed about AI. And that's harmful to artists whether or not the AI does a good job replacing them or not.
Today tried to force it to give me anything about webauthn stuff in rust and it gave me halucinated responses 100% of the time :d
You know that guy who likes to seem knowledgable, and always has an answer for everything, even if they have to invent one? #GPT
@@NoBoilerplate The world slumbers when 0 = 0, but when 0 = 1, everyone loses their shit.
I believe the problem of hallucinations is because chatgpt isn't actually an "AI assistant", it's merely simulating one. At the beginning of each chat it's explicitly told "You are 'a smart AI assistant'. Answer all of the user's queries following these rules: . Here's the user's query: ..." and then tries to answer what it _thinks_ a smart AI assistant would answer. Obviously it can't perfectly mimic an AI assistant because its knowledge is limited. It's like asking a friend "hey, act like you're a professional electrical engineer and try to answer all my questions". If you get a "hallucinated" answer, it's not because your friend is confident in their answer and just acting weird, but because they think that such an answer sounds _close enough_ for a professional electrical engineer to say.
It's so wild there aren't as many videos out therr about this; I completely agree with everything that you said reg accuracy vs complexity. My business has wasted so much time and money trying to shoehorn AI into our workflow. One thing though... "Pay less attention to what these companies promise for the future and pay more attention to what they actually do in the present." The problem with that is that we live in a day and age where fake demos/product showcases and fake testimonials are so common. I don't remember this being such a problem in the past but now it's almost every launch of something.
I had a hilarious moment a few days ago... I'd just had a discussion with some colleagues where we all basically agreed that AI as it stands isn't useless per se... it's just probably not making back the amount that is being paid for it (in power and dev time if nothing else). The next day I was poking around on a Microsoft site and there was this hilarious photo declaring that AI was here and someone was using it to ask "what celeb does my dog look like". I had to post it to them with the caption "Guys guys, Microsoft has the answer!"
As an AI nerd from way back, the idea that Copilot could come back with an answer to that question (and I am assuming that it did) that is anywhere better than "completely non-sequitur" is nothing short of incredible, like literally answering all of the incredulous questioning that people of even 5 years ago had about AI. As a business software developer in current year, I question how it's going to make back the billions of dollars that it cost to train.
the sad part is they did it! GPT is great at natural language processing! The solution isn't to pour money into more complex models (geometric complexity has diminishing returns), but to start building tools that use GPT for the language processing part.
But that's not as sexy for investors 🙃
@@NoBoilerplate Just as blockchain was before it really (although for very different reasons).
FWIW, in Dream On, the harpsichord is doubled by an electric guitar, as implied by "blending classical and rock elements" but weirdly left unstated.