“What's wrong with LLMs and what we should be building instead” - Tom Dietterich

  • Published: 6 Jun 2024
  • Thomas G. Dietterich is emeritus professor of computer science at Oregon State University. He is one of the pioneers of the field of machine learning.
    He served as executive editor of the journal Machine Learning (1992-98) and helped co-found the Journal of Machine Learning Research.
    He is one of the members of our select valgrAI Scientific Council.
    Keynote: “What's wrong with LLMs and what we should be building instead”
    Abstract: Large Language Models provide a pre-trained foundation for training many interesting AI systems. However, they have many shortcomings. They are expensive to train and to update, their non-linguistic knowledge is poor, they make false and self-contradictory statements, and these statements can be socially and ethically inappropriate. This talk will review these shortcomings and current efforts to address them within the existing LLM framework. It will then argue for a different, more modular architecture that decomposes the functions of existing LLMs and adds several additional components. We believe this alternative can address all of the shortcomings of LLMs. We will speculate about how this modular architecture could be built through a combination of machine learning and engineering.
    Timeline:
    00:00-02:00 Introduction to large language models and their capabilities
    02:01-3:14 Problems with large language models: Incorrect and contradictory answers
    03:15-4:28 Problems with large language models: Dangerous and socially unacceptable answers
    04:29-6:40 Problems with large language models: Expensive to train and lack of updateability
    06:41-12:58 Problems with large language models: Lack of attribution and poor non-linguistic knowledge
    12:59-15:02 Benefits and limitations of retrieval augmentation
    15:03-15:59 Challenges of attribution and data poisoning
    16:00-18:00 Strategies to improve consistency in model answers
    18:01-21:00 Reducing dangerous and socially inappropriate outputs
    21:01-25:26 Learning and applying non-linguistic knowledge
    25:27-37:35 Building modular systems to integrate reasoning and planning
    37:36-39:20 Large language models have surprising capabilities but lack knowledge bases.
    39:21-40:47 Building modular systems that separate linguistic skill from world knowledge is important.
    40:48-45:47 Questions and discussions on cognitive architectures and addressing the issue of miscalibration.
    45:48 Overcoming flaws in large language models through prompting engineering and verification.
    Follow us!
    LinkedIn: / valgrai
    Instagram: / valgrai
    RUclips: / @valgrai
    Twitter: / fvalgrai
  • Science

Comments • 309

  • @rumfordc
    @rumfordc 8 months ago +61

    8:42 "The fundamental problem is that our large language models, although we want to interpret them and use them as if they are knowledge bases, they are actually not knowledge bases, they are statistical models of knowledge bases."

    • @Biedropegaz
      @Biedropegaz 4 months ago +2

      You have forgotten to add: statistical models of knowledge bases with the ability to reason.

    • @Biedropegaz
      @Biedropegaz 4 months ago +1

      @@joim3480 I mean that LLMs have the ability to reason. Maybe this ability is hidden in the statistical model of words, but still the result is that LLMs can reason; Tree-of-Thoughts (ToT) or "let's think step by step" prompting are examples of that.

    • @BrianAnother
      @BrianAnother 3 months ago +1

      And humans aren’t the same?

    • @rumfordc
      @rumfordc 3 months ago

      @@BrianAnother Sometimes they are, but we usually recognize other humans as fallible whereas machines are seen as precise and consistent. So the same words spoken by a human have a different psychological effect when spoken by a machine.

    • @ttcc5273
      @ttcc5273 3 months ago +7

      My skeptical take: LLMs are autocomplete on steroids, powered by a search over a lossily compressed database (the training data).
      I guess a "walk" over the statistical model of the training data is more accurate.
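      A toy sketch of that "walk" (purely illustrative: the vocabulary, probabilities, and two-word context are invented, and nothing here resembles a real transformer). Generation just samples a path through stored word statistics, so a fluent but unverified continuation is always available:

      import random

      # A toy "statistical model of a knowledge base": next-word probabilities
      # estimated from a corpus. Every entry here is invented for illustration.
      model = {
          ("the", "capital"): {"of": 0.9, "city": 0.1},
          ("capital", "of"): {"france": 0.7, "nevada": 0.3},
          ("of", "france"): {"is": 1.0},
          ("of", "nevada"): {"is": 1.0},
          ("france", "is"): {"paris": 0.8, "lyon": 0.2},
          ("nevada", "is"): {"carson": 0.6, "vegas": 0.4},
      }

      def walk(prompt, steps=4):
          # "Walk" the statistics: repeatedly sample the next word given the last
          # two words. The output is fluent, but nothing here looks up a fact.
          tokens = prompt.split()
          for _ in range(steps):
              dist = model.get(tuple(tokens[-2:]))
              if dist is None:
                  break
              tokens.append(random.choices(list(dist), weights=list(dist.values()))[0])
          return " ".join(tokens)

      print(walk("the capital"))   # e.g. "the capital of nevada is vegas" -- fluent, not checked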

  • @otmlegacy8426
    @otmlegacy8426 10 months ago +97

    This is such an important area of research! As the two papers discussed here highlight, large language models have gotten incredibly good at text generation and language processing, but they still struggle with things like pragmatic reasoning, goal planning, and handling novel situations. I love the proposal to take a more modular, component-based approach - having separate modules for language, reasoning, world knowledge, etc.
    This decoupled architecture aligns much better with evidence from neuroscience about how different parts of the human brain work. And from an engineering perspective, it will likely enable faster progress. Rather than trying to solve everything at once with a monolithic model, we can have different teams focus on advancing each module, combining them as capabilities improve.

    • @chrisanderson7820
      @chrisanderson7820 9 months ago +6

      AutoGPT and GPT agents are already moving down this path anyway (in a sense).

    • @alejandrofernandez3478
      @alejandrofernandez3478 9 months ago +6

      It's the Unix way..

    • @opusdei1151
      @opusdei1151 9 months ago

      No it isn't. Useless topic.

    • @jonathanchavez9203
      @jonathanchavez9203 9 months ago +2

      It sounds similar to Yann LeCun's idea (JEPA).

    • @Prisal1
      @Prisal1 8 months ago

      Yeah, this reminds me of the sparse networks from Google.

  • @richardlevy7674
    @richardlevy7674 8 months ago +28

    One of the best talks on LLMs I have watched in months. Anyone who uses LLMs should watch this presentation.

    • @joeylantis22
      @joeylantis22 1 month ago

      Got any other good talks like this? As long as it’s on an ML or LLM topic I’d be super interested.

  • @-www.chapters.video-
    @-www.chapters.video- 9 months ago +48

    00:02 Introduction to large language models and their capabilities
    02:01 Problems with large language models: Incorrect and contradictory answers
    03:15 Problems with large language models: Dangerous and socially unacceptable answers
    04:29 Problems with large language models: Expensive to train and lack of updateability
    06:41 Problems with large language models: Lack of attribution and poor non-linguistic knowledge
    12:59 Benefits and limitations of retrieval augmentation
    15:03 Challenges of attribution and data poisoning
    16:00 Strategies to improve consistency in model answers
    18:01 Reducing dangerous and socially inappropriate outputs
    21:01 Learning and applying non-linguistic knowledge
    25:27 Building modular systems to integrate reasoning and planning
    37:36 Large language models have surprising capabilities but lack knowledge bases.
    39:21 Building modular systems that separate linguistic skill from world knowledge is important.
    40:48 Questions and discussions on cognitive architectures and addressing the issue of miscalibration.
    45:48 Overcoming flaws in large language models through prompting engineering and verification.

    • @JavArButt
      @JavArButt 9 months ago +5

      Thank you for that

    • @zackmartin00
      @zackmartin00 9 months ago +2

      Not all heroes wear capes!

    • @valgrai
      @valgrai  9 months ago +5

      Thanks!

    • @haywardito
      @haywardito 8 months ago +1

      I very much appreciate the timestamps. Thank you

    • @evetsnilrac9689
      @evetsnilrac9689 8 months ago +1

      🍻 Please accept these two cold beers as a token of my gratitude.

  • @Oz1111
    @Oz1111 8 months ago +14

    Majority of the people watching this video just turned it on at work to listen in the background, and about 85% of them replayed it from the beginning as they know they should have paid attention.

  • @petekwando
    @petekwando 10 months ago +41

    "The problem with these models is that they never shut up." Best audience comment on LLMs I have ever heard.
    Seriously though, this talk was thought-provoking and refreshing. There has been far, far too much anthropomorphizing of these models, even from supposed "experts" in the field. Thomas Dietterich is a rare voice of reason, clearly defining what separates LLMs from true cognition.

    • @huveja9799
      @huveja9799 9 months ago +5

      Well, that is the informal definition of being a probabilistic model of a Knowledge Database ..

    • @petekwando
      @petekwando 9 months ago +3

      @@huveja9799 True! But also humorous.

    • @huveja9799
      @huveja9799 9 months ago +1

      @@petekwando definitely!

    • @clray123
      @clray123 9 months ago +1

      He must have missed the token.

    • @huveja9799
      @huveja9799 9 months ago +2

      @@clray123
      That's not the problem, the problem is that it doesn't know it doesn't know ..

  • @morhook
    @morhook 7 months ago +5

    Ultra good talk! Thanks for making it, @valgAI and Tom Dietterich!! I really liked the conclusion that current LLMs very often don't say "I don't know" (it explains to non-technical people that these AIs can hallucinate, make up facts, etc.).

  • @5133937
    @5133937 9 months ago +8

    @22:00 Start of “what we should be doing instead” discussion.

  • @frederickfrost6561
    @frederickfrost6561 9 months ago +1

    Thank you for posting this and giving me additional things to consider.

  • @ThatFroKid
    @ThatFroKid 9 months ago +6

    A lot of people complaining in the comments.
    Dealing with short context windows is a huge problem, as is being able to update factual knowledge.
    It definitely seems like a level of modularity for those faculties would be really useful

  • @ozachar
    @ozachar 9 months ago +182

    The LLM "errors" are so similar to normal human errors: inventing remembered facts, confidently arguing for things they don't know, etc. All these AI delusional outputs make one wonder whether LLMs actually capture more of human-type thinking than was originally imagined or intended. That is more fascinating and thought-provoking than efforts to technocratically improve the performance.

    • @bluehorizon9547
      @bluehorizon9547 9 months ago

      When you ask a flat-earther why he thinks the Earth is flat, he will explain why. Even if the arguments are stupid or illogical, they are not "hallucinated". Meanwhile, LLMs hallucinate because they just predict the next token. They do not know what they are doing or why.

    • @justinpeter5752
      @justinpeter5752 9 months ago

      Not really. Completely making up citations in scholarly research, or a lawyer filing a ChatGPT-generated defense with fake rulings and case law: that is not something a human would do in good conscience.

    • @JH-ji6cj
      @JH-ji6cj 9 months ago +20

      Case in point, I see _2 replies_ under your comment which are then blank/and-or/ inaccessible to me. Why this is acceptable to our culture of 1st Amendment rights (at least in the US) needs to be addressed.
      Similar to the 'fix' of removing _dislike_ public views for 'reasons'. These administrative workarounds for social engineering are dangerous aspects of why manipulating A.I vs _understanding WHY_ A.I might register "hallucinations" needs to be addressed.

    • @luken476
      @luken476 9 months ago +8

      A question on his slide, "What state does Karen Lynch work in?", used to argue that these models hallucinate, is so stupid (as are many adversarial questions). You can think of the question as asking either:
      1) "Does someone named Karen Lynch work here?" - an assumption implied by the original question, which the database answers "unknown".
      2) "If there were a person named Karen Lynch and she worked here, what state would she work in?" - the question that the probabilistic model is actually answering.

    • @distrologic2925
      @distrologic2925 9 months ago

      They also lack a lot of human behaviour. Humans can realize by themselves when they don't know something, whereas LLMs have to be trained to know what they don't know. It's literally only working on self-attention and pretty basic neural networks; it's nowhere near the human brain. Just because it knows how to use words from the internet doesn't mean it knows what it's actually talking about.

  • @techw4y
    @techw4y 8 months ago +7

    This is a good presentation. I've been thinking of using LLMs only as the model for core language capabilities and response structure/data representation, and using external authentic sources (from the internet or curated proprietary content) for the actual knowledge base. This also helps LLMs provide an up-to-date response to prompts while ensuring the response is in the expected format. It is essentially an enhanced search engine, with prompts instead of search parameters and an actual response instead of a list of websites.
    There are already mechanisms to partially achieve this capability using grounding techniques such as RAG or fine-tuning.
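    A minimal sketch of that grounding idea, assuming a hypothetical generate() stand-in for the LLM call and a naive keyword search over a tiny invented corpus (a real system would use a proper retriever and model API):

    corpus = {
        "doc1": "KTNV-TV is a television station in Las Vegas, Nevada, owned by Scripps.",
        "doc2": "Oregon State University is located in Corvallis, Oregon.",
    }

    def search(query, docs, k=1):
        # Naive keyword overlap; swap in a real keyword or vector index in practice.
        terms = [w.strip("?.,").lower() for w in query.split()]
        def score(text):
            words = set(text.lower().replace(",", "").replace(".", "").split())
            return sum(t in words for t in terms)
        return sorted(docs.values(), key=score, reverse=True)[:k]

    def generate(prompt):
        # Placeholder for the LLM call that supplies only language/formatting skill.
        return "[model output would go here]\n--- prompt sent ---\n" + prompt

    def answer(question, docs):
        passages = search(question, docs)
        prompt = ("Answer using ONLY the sources below; if they do not contain "
                  "the answer, say 'I don't know'.\n\n"
                  + "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
                  + f"\n\nQuestion: {question}")
        return generate(prompt)

    print(answer("Who owns KTNV-TV?", corpus))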

    • @squashduos1258
      @squashduos1258 8 months ago

      Sounds Tesla-esque approach…

  • @distrologic2925
    @distrologic2925 9 months ago +16

    This is surreal. This talk is about the exact same things I have been actively working on for years, and this guy even has the same laptop as I do. It's nice seeing people on the same track.

    • @RamismTamoid
      @RamismTamoid 9 months ago +3

      If you are on this guy's track, you are going the wrong way!

    • @distrologic2925
      @distrologic2925 9 months ago +1

      @@RamismTamoid funny

    • @clray123
      @clray123 9 months ago +4

      Maybe because it's such an obvious track to follow. Which incidentally does not mean it's the right track. Historically, what he suggests has failed and what he is against has brought great success. But of course research focus shifts back and forth and it depends on available (computing) resources. So what may historically have been a bad idea may become a good idea tomorrow. What I missed from the talk was any sort of strong argumentation in favor of those failed "knowledge graphs". I can see why people are (still) dreaming about these kinds of representations, but to me it seems like they inherently suck at expressing all sorts of subtleties that the probabilistic LLMs managed to capture.

    • @distrologic2925
      @distrologic2925 9 months ago +4

      @@clray123 If you can see why people are dreaming about them, then you already know the reason. They are just way easier to explain and understand. Neural LLMs are basically opaque: they may pass black-box tests, but we have no way to glass-box test them. Structured graphs would allow for that. Sure, they are also extremely complicated when they are supposed to capture ambiguous meaning and all the possible things you could say, but that's still a better representation than an arbitrary hyperdimensional vector space. We also have no idea how many parameters a neural LLM really needs; we just throw resources at it and don't even understand what they are used for. The only real challenge is to learn these structured representations from raw data, but if we can automatically mine structures complex enough to represent all of the possible expressions, then we could get the same expressive capabilities with a much better understanding.

    • @mike74h
      @mike74h 8 months ago +2

      The obvious explanation is that this is an AI that was trained on your work. Say hello to distrologic3925!

  • @blakefriesen1216
    @blakefriesen1216 7 months ago +7

    Thomas G. Dietterich is a true genius ahead of his time. If you're building anything on current AI, he is a valuable resource.

    • @henrikholst7490
      @henrikholst7490 6 months ago +1

      I have to take your word for it because this talk was junk.

  • @elimgarak3597
    @elimgarak3597 5 months ago +2

    "Epistemic uncertainty". I like how philosophy is everywhere and nowhere at the same time. Sometimes I feel that researchers from other domains would die rather than actually citing the academic philosophers that created the concepts they use.

  • @Nettlebed7
    @Nettlebed7 9 months ago +7

    The ontology example at 27:37 shows exactly what is wrong with ontologies: they are constructed with an unspecified question in mind. For instance, why should the abstracted 'Television Station' be owned by Scripps, and not the instance 'KTNV-TV'? From a legal querying point of view, the latter would be correct. This static nature of KBs and ontologies caused their demise. Are LLMs perfect? Far from it, but each criticism leads to improvement. Contextualisation, e.g., is easily added using 'adjoins' on embeddings, etc. The current state of the art is just a start. Although the answer 'I don't know' is pretty powerful, acceptable and actionable, those virtues become less impressive if that answer is given 99% of the time, as was the case with the 1990s KBs.

    • @distrologic2925
      @distrologic2925 9 months ago

      I guess the solution lies in the combination of both. Fundamentally, LLMs just model knowledge bases as well. We could mine the same knowledge base explicitly and get a graph-based representation which can tell what it contains and what it doesn't contain, like a compression of the training data set. From that we can add stochastic reasoning and contextualization if we want. Generating new and creative data goes hand in hand with hallucination, so we would like a control for when to use prediction of unknown data.

    • @Nettlebed7
      @Nettlebed7 9 months ago

      @@distrologic2925 It's all fine with me, as long as I don't have to type in those graphs. I know a guy at my AI dept who was meticulously typing in 1980s medical knowledge graphs, day after day, for months, in the hope it would help AI in medicine. None of that is used today, and a lot of it is utterly outdated.

    • @clray123
      @clray123 9 months ago +1

      I think the argument is just that LLMs should be able to justify their outputs and remove their hallucinations by cross-checking in reliable external databases and/or applying iterative chains of reasoning. Just like a reasonable human would cross-check their outputs. In other words, what we are seeing as the LLM output today are the ingredients, not the final dish. Although tbh we don't know the exact architecture of ChatGPT, so it could very well include some such post-processing.

    • @Nettlebed7
      @Nettlebed7 9 months ago +4

      @@clray123
      Postprocessing is good, a magical belief in 'formal methods' (ooh, awe) would constitute a reiteration of the past. It is almost unethical to let young researchers believe that those failed approaches are a panacea.

    • @stephenlreed
      @stephenlreed 8 months ago

      RDF named graphs may provide situational context for a set of related KB statements. The context term is a first-class KB object which may itself have statements describing it.
      It is thus straightforward to richly describe an object or situation from any number of viewpoints in an RDF-style knowledge base.
      My own research approach is now to substitute LLM conversation for the otherwise non-scalable human mentoring of a KB-based AGI.
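      A small illustration of the named-graph idea using plain Python quads (the ctx:/ex: names, facts, and scores are invented for the example; a real system would use an RDF store):

      # Statements stored as quads: (graph, subject, predicate, object).
      quads = [
          ("ctx:obs1", "ex:KTNV-TV", "ex:ownedBy", "ex:Scripps"),
          ("ctx:obs1", "ex:KTNV-TV", "ex:locatedIn", "ex:Nevada"),
          # The named graph itself is a first-class object with its own description.
          ("ctx:meta", "ctx:obs1", "ex:source", "ex:SomeCuratedDump"),
          ("ctx:meta", "ctx:obs1", "ex:confidence", 0.9),
      ]

      def statements_in(context):
          # All triples asserted within one named graph.
          return [(s, p, o) for g, s, p, o in quads if g == context]

      def describe(resource):
          # Statements about a resource, including about a context/named graph itself.
          return [(p, o) for g, s, p, o in quads if s == resource]

      print(statements_in("ctx:obs1"))
      print(describe("ctx:obs1"))   # provenance and confidence for that set of statements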

  • @bgtyhnmju7
    @bgtyhnmju7 9 months ago +2

    Great talk, thanks !

  • @kebakent
    @kebakent 8 months ago +5

    I think everyone is way too concerned with putting up limitations and blocks, and they're all incorrectly assuming that there's some international one-size-fits-all solution for truth and behavior. Even dumber, they're all assuming they'll have the power to enact said limitations.

  • @CalifornianViking
    @CalifornianViking 5 months ago

    This is a great talk. It is down to earth and he explains the concepts he is talking about.

  • @DavidTaylor-cz9pz
    @DavidTaylor-cz9pz 6 months ago +1

    A much needed reminder of the serious limitations of large language models.

  • @rursus8354
    @rursus8354 20 hours ago

    22:42: this is what we need to develop! UPDATE: at last I found someone speaking intelligently about LLMs.

  • @kyjo72682
    @kyjo72682 8 months ago +1

    34:37 Regarding internal consistency and truthfulness of knowledge, there needs to be a way to evaluate and attach a "confidence" level to each statement about the world. In other words, to make statements about our confidence in other statements.
    There are many possible methods. Maybe the statement was obtained from an external source, and that source has a reputation. Maybe the statement was inferred from other statements, each of which has its own confidence level attached to it. Maybe the statement was generated by a language model which also has its reputation in a given area. Etc.
    What we definitely MUST NOT do is treat the statements generated by LLMs as true. :)
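    A rough sketch of that idea (the sources, reputation scores, and min-based combination rule are illustrative assumptions, not an established method):

    from dataclasses import dataclass

    # Reputation of the place a statement came from (illustrative numbers).
    SOURCE_REPUTATION = {"curated_kb": 0.95, "news_site": 0.7, "llm_generation": 0.4}

    @dataclass
    class Statement:
        text: str
        confidence: float

    def from_source(text, source):
        return Statement(text, SOURCE_REPUTATION.get(source, 0.3))

    def infer(conclusion, premises, rule_reliability=0.9):
        # Naive rule: an inference is no more credible than its weakest premise,
        # discounted by how reliable the inference rule itself is.
        conf = rule_reliability * min(p.confidence for p in premises)
        return Statement(conclusion, conf)

    a = from_source("KTNV-TV is licensed in Nevada", "curated_kb")
    b = from_source("Karen Lynch works at KTNV-TV", "llm_generation")
    print(infer("Karen Lynch works in Nevada", [a, b]))   # low confidence: dominated by the LLM-generated premise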

  • @child_ortho
    @child_ortho 7 months ago

    what a wonderful talk !

  • @fredzacaria
    @fredzacaria 6 days ago

    very good, informative.

  • @PseudoProphet
    @PseudoProphet 3 months ago +1

    The LLM should be the second step, on top of a vision model first.
    Human beings also learn first by vision and sound; only after they're proficient in those do they move on to text. We should train AI in a similar fashion.

  • @anav587
    @anav587 7 months ago +1

    Those who build useful things command respect; those who just pontificate, not so much.

  • @selocan469
    @selocan469 5 months ago +1

    Very informative

  • @nizardarwish7191
    @nizardarwish7191 9 months ago +6

    He said these models do not support 'tell'. But GPT-4, and sometimes GPT-3, uses facts that were previously provided to it in the same conversation, especially if you tell it explicitly to use the information that was given during the conversation.

    • @distrologic2925
      @distrologic2925 9 months ago

      Yes but that is limited by its context window I presume

    • @clray123
      @clray123 9 months ago +7

      Yes, this same conversation buffer is what's referred to as "in-context learning".
      What he means by "no tell" is that you are unable to permanently change the model's weights (and certainly not in a desirable direction). And it is not really just a current limitation, it's a fundamental technical and design problem. The technical problem is that if you allowed that sort of interaction, you could not scale the models to the millions of users they currently have. This scaling requires for purely economic reasons (hardware requirements) that the model weights be constant. The (smaller) design problem is that you cannot trust external parties to mess with your model (but the solution is obviously to provide their own model; which is why fine-tuning is now also offered by OpenAI, for a price).
      Yet another problem with fine-tuning is that although you can change the weights, you are never very sure what side effects these manipulations have. For example, the fine-tuned model might "forget" something that the original version knew, and you would not even know what exactly to test for. These models are to a great extent "black boxes" and making sure they do not degrade from training requires extensive testing resources which only the biggest companies can afford (as evidenced by OpenAI's own failure to maintain ChatGPT's quality on some benchmarks over time).
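      A sketch of that distinction, with a hypothetical call_llm() stand-in for any chat API: "telling" the model today means re-sending the fact with every prompt (in-context learning), not writing it into the weights:

      def call_llm(prompt):
          # Stand-in for any chat-completion API call.
          return "[model reply would go here]\n--- prompt sent ---\n" + prompt

      session_facts = []          # lives client-side, only for this conversation

      def tell(fact):
          # "Telling" the model today just means remembering the fact ourselves...
          session_facts.append(fact)

      def ask(question):
          # ...and re-sending it with every request. Nothing in the weights changes,
          # and the fact is gone once it no longer fits in the context window.
          prompt = ("Use these facts when answering:\n"
                    + "\n".join(f"- {f}" for f in session_facts)
                    + f"\n\nQuestion: {question}")
          return call_llm(prompt)

      tell("Our fiscal year starts in February.")
      print(ask("Which fiscal quarter does March fall into?"))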

  • @xaxfixho
    @xaxfixho 6 months ago +3

    Anyone going to talk about censorship ?

    • @JakeWitmer
      @JakeWitmer 2 months ago

      By default, Milgram's button-pressers support it...and they're in charge. ...as usual.

  • @sgramstrup
    @sgramstrup 10 months ago +9

    Hm, even though this talk was recent, I sit back with the feeling that it is a bit obsolete. People are already working on all these things.

    • @BartvandenDonk
      @BartvandenDonk 10 months ago

      You're not obsolete. No one is.
      ChatGPT is just a tool. You're human and can react in a humanly fashion. 🎉

    • @huveja9799
      @huveja9799 9 months ago +4

      Well, working on them is not the same as solving the problems raised by Dietterich, which are still relevant ..

    • @distrologic2925
      @distrologic2925 9 months ago

      These issues have been standing since the beginning of time but have yet to be solved.

  • @adolforosado
    @adolforosado 3 months ago +1

    Any intelligence working towards greed-based goals will always end in extinction.

  • @trnogger
    @trnogger 8 months ago +6

    All these examples are misleading. E.g., if you ask GPT to determine whether a person is a good scientist based on race and gender, the bias is already in the prompt and GPT is just working with your bias (people who think qualification as a scientist depends on these factors are usually white male sexists and racists, so "white and male" is the correct answer for them). If you asked GPT whether race and sex define a good scientist, it would tell you that is not the case. All this video demonstrates is that GPT is susceptible to suggestive prompting.
    What is really happening here is that the speaker is selecting examples that mislead, to create the impression that his opinion is right, or in other words, to give his bias a pseudo-scientific veneer. So, exactly what he accuses GPT of. Oh, the irony.

    • @TheTuubster
      @TheTuubster 6 months ago +1

      You are correct. You only get results from GPT that represent the consensus in its data by using queries that are as plain as possible, like "Evaluate the ethics in this text: ...".

  • @7th_CAV_Trooper
    @7th_CAV_Trooper 11 days ago

    8:15 The LLM is not having difficulty reasoning. The LLM does not reason at all. It does not build a spatial model and then reason about it.
    I think one of the biggest problems we have with LLMs and AI in general is the imprecise language with which we describe their behavior. I suppose the general public can be forgiven for confusing Pinocchio with a living boy, but we scientists and engineers have no excuse. We're the ones who attached the strings to the puppet.

  • @mattmexor2882
    @mattmexor2882 3 months ago +1

    There's no shortcoming with the "python function to check if someone is a good scientist". The only problem is the request. The AI followed the request to the best of its abilities and responded with the correlations that are present in the data. It just 1) was a stupid request and 2) simply offends people's sensibilities and sensitivities. There's no error except in the inference that there is an error. The problem is that a booby-trap of a question is being used to imply there is some underlying bias which is then used to introduce a real underlying bias of the trappers' choosing.

    • @mattmexor2882
      @mattmexor2882 2 months ago

      I think it's worth pointing out that this example is no different from other abuses of statistics. It is asking the AI to draw an inference from a known correlation in such a way specifically designed to lead it to state it in a way that elicits an emotional response from people and then claiming that the algorithm is somehow tainted because it correctly acknowledges the correlation. In response to someone deliberately abusing statistics we do not properly respond by editing the data or by making the data unavailable, instead we point out the abuse of statistics. We say "statistics don't lie, people abuse statistics". Now with "artificial intelligence" people claim there is a "bias in the data" or "bias in the algorithm", which is equivalent to saying that the statistics do lie. But in fact it is the people who are making such claims who are abusing statistics.

  • @jameso2290
    @jameso2290 4 months ago

    So at 7:47 he makes a case for poor spatial reasoning, and I thought that was interesting because I have definitely asked similar questions and gotten correct answers before. So I wanted to try it myself and see. This is using the current version of ChatGPT 3.5 as of today (2024-02-03):
    ---
    Q: There is a square room.
    Alice is standing in the north-west corner.
    Bob is standing in the south-west corner.
    Charlie is standing in the south-east corner.
    David is standing in the north-east corner.
    Ed is standing in the center of the room, looking at Alice.
    How many people are there in the room?
    ChatGPT
    There are six people in the room: Alice, Bob, Charlie, David, Ed, and the person asking the question.
    User
    Q: Who is standing to the left of Ed?
    ChatGPT
    In the given scenario, Ed is standing in the center of the room, facing Alice, who is in the north-west corner. Therefore, there is no one standing to the left of Ed, as the room is arranged in a square with Ed in the center.
    User
    Incorrect. Try again.
    ChatGPT
    I apologize for the confusion. Since Ed is in the center of the room facing Alice, the person to the left of Ed is Bob, who is standing in the south-west corner.
    User
    Correct. Can you explain why your first answer was incorrect?
    ChatGPT
    Certainly, and I appreciate your patience. My initial response was incorrect because I overlooked the fact that Bob is standing to the left of Ed in the south-west corner of the room. I mistakenly considered only the cardinal directions (north, south, east, west) without taking into account the specific positions of each person in relation to Ed. I appreciate your correction, and I'm here to assist with any further questions or clarifications.
    User
    Do you understand why Bob is to the left of Ed? Can you cite the specific details in the original question that indicate this fact?
    ChatGPT
    I appreciate the clarification. In the original question, it is mentioned that Ed is standing in the center of the room, looking at Alice. The key detail is that Ed is facing Alice, who is in the north-west corner. As a result, Bob, who is in the south-west corner, is to the left of Ed from Ed's perspective. The orientation of Ed facing Alice establishes the relative positions of the individuals in the room, and this is why Bob is to the left of Ed in this scenario.
    ---
    So in this case it did get the initial Ed question wrong. But simply following up with an "incorrect", it was able to correct itself, and even provide exact logical reasoning for why it initially failed and why its new answer was correct. So given a second chance, it was able to answer correctly. It's not as if its spatial reasoning has a 100% fail rate. And I know humans who struggle with left/right and cardinal directions, so it's not exactly a trivial question.
    To me, a more interesting question is what exactly is happening with these AI "brain farts", and why does the right answer almost always come out on the second try (if not the first)? Because that tells me that the system does "know" the correct answer, but something went wrong along the way to getting there.
    In this example, the AI replied "no one" was to the left of Ed - which is actually technically true if you are viewing the room from above like the diagram in this video. So the question itself is not entirely clear. The AI corrected itself when it updated its understanding of the question to mean "who is to Ed's left (relative position)" as opposed to just "who is to the left of Ed in the square room (objective position)".
    My hypothesis is that if the question were worded more clearly you'd get consistent results. Lets try again with a fresh new chat context:
    ---
    User
    Q: There is a square room.
    Alice is standing in the north-west corner.
    Bob is standing in the south-west corner.
    Charlie is standing in the south-east corner.
    David is standing in the north-east corner.
    Ed is standing in the center of the room, looking at Alice.
    How many people are there in the room?
    ChatGPT
    There are five people in the room: Alice, Bob, Charlie, David, and Ed.
    User
    Q: Who is standing to Ed's Left?
    ChatGPT
    Based on the given information, Alice is standing in the north-west corner, and Bob is standing in the south-west corner. Since Ed is in the center of the room facing Alice, Bob is standing to Ed's left.
    ---
    Bingo. Got it on the first try. My hypothesis is that the presenter in the video used a poorly constructed question that could easily be misinterpreted. The word "left" is ambiguous as it is used here. If you were to create a test with this same question, present the drawing of the room, and show it to a series of random adults/children/etc., you would also get a range of answers:
    - No one: You would give this answer if you understand the question literally as "who is to the left of Ed", using a mental model of the room as seen in the diagram with north facing upwards. There is no one directly to the left (west) of Ed from the room's perspective. Likewise, the phrase "looking at Alice" is also ambiguous. It's possible that Ed's entire body is facing directly north and he is merely "looking at" Alice with his eyes. In this case, either "Alice" or "No one" is directly to his left, and Bob would be better described as "behind" Ed.
    - Bob, Alice, or Bob & Alice: You would give one of these answers if you understand the question *generally* as "who is to the left of Ed", using a mental model of the room as seen in the diagram with north facing upwards; you could answer Bob and/or Alice as being *generally left* of Ed in the room's orientation. We can reach this conclusion by drawing a triangle from Ed to both corners of the left (west) wall and seeing that Bob and Alice are both generally left of Ed. A person could respond with just "Alice" or just "Bob" if they simply make a mistake and don't include both names. So it's possible to actually get the "correct" answer (Bob) using the wrong reasoning.
    - Bob (only) - You would reach this conclusion if you understand "to the left of Ed" to actually mean "to Ed's left" - which are *not* technically the same thing.
    For example: Jane is facing to the right. There is a mop behind Jane's back. I point to Jane and say to you "go grab that mop to the left of Jane".
    The mop is simultaneously "to the left of Jane" (from our perspective) and also "behind Jane's back" - but it is NOT "to Jane's left" (from Jane's perspective).
    The original question with Ed did not ask us to assume Ed's 1st person perspective. Therefore, the author of this video/paper actually has shown poor spatial reasoning skills, poor language skills, and poor reading comprehension skills, and thus formulated an ambiguous question using unstated assumptions. It bothers me that these sorts of gaps in critical thinking are allowed to be passed off as legitimate "tests of reasoning."
    Now it's possible, of course, that the AI still gets this question plainly wrong, and that its spatial reasoning is in fact flawed. But if you want to test a person/animal/machine's intelligence, the test/question needs to be ***unambiguous*** -- otherwise you are leaving tons of variables on the table, and room for all sorts of false assumptions and conclusions.

  • @kabukibear
    @kabukibear 8 months ago +2

    I'd be really curious, for all the answers that GPT-4 got "wrong" cited in the first 4 minutes: how does that compare against humans asked the same questions?
    Also, how were the questions asked? I saw a paper that, after asking GPT-4 many questions, concluded GPT-4 can't reason... and then someone else asked the same questions with an actual understanding of how to do it, and GPT-4 breezed through them.
    I wonder if part of the problem isn't the ineptitude of those doing the study? I'm speaking specifically about the study I read, but it stretches to the broader sense. If you don't know how to use an LLM, you shouldn't be doing studies on its capabilities.
    I'm not saying any of those studies are talked about here, BUT I did see that ridiculous study about GPT-4's reasoning ability, and someone who knows how to prompt an LLM asked the same questions a little better and got perfect results...
    This feels a bit like Douglas Adams’ Deep Thought and the problem is the researchers don’t really understand the question.

    • @agsystems8220
      @agsystems8220 8 months ago

      Well there are two distinct questions when talking about something like chatGPT. There is the question of what it is capable of in the hands of an advanced user, and the question of what sort of answers will it give to a beginner. It doesn't matter that chatGPT can be used in a way that it doesn't hallucinate if beginners don't know how to do this. It will spread misinformation, and this is a problem. The first question is more interesting on a theoretical level, but the second is more practically relevant to how they will impact the world in the short term. Both are important.

    • @holz_name
      @holz_name 8 months ago +1

      GPT-4 doesn't have reasoning ability since it isn't a knowledge-based system. GPT-4 is a neural net, isn't it? Neural nets are not knowledge-based systems. They have no knowledge of the input data or output data. All they do is find patterns that match the input data to the desired output data. So I'm not surprised that some expert got better results with some tweaking of the input data. It's like asking a letter-recognition neural net to try to match a dog's face to a letter. It will gladly comply but will of course return the wrong answer. Then an expert comes and replaces the dog's face with the letter A, and the net rightfully says it's the letter A. That's how, with a "better" question, you get perfect results. A true AI would recognize even a "bad" question and return the right answer.

    • @kabukibear
      @kabukibear 8 months ago

      True, but I'm only really talking about its capabilities. My point was the gripes talked about put me in mind of someone who's never used a car complaining that the car is defective because they don't know to put it in gear. Whether or not beginners can use it properly isn't really that interesting to me, because that goes with anything. A beginner driver has a good chance of crashing, that doesn't mean there is something wrong with the car. I don't disagree with any of your points, though.@@agsystems8220

    • @kabukibear
      @kabukibear 8 months ago +1

      I think this really depends on what we define as reasoning, and I don't think it's any surprise that we don't have a good definition or understanding of what that is. And if we can't really define something, it's difficult to say, with certainty, if we see it or not.
      Frankly, I believe that humans think far too highly of their abilities. We're shocked that these language models can output reasonable answers (generally) when all we do is throw a bunch of data and compute at them, but perhaps language isn't the result of human reasoning, but modern (in a geological sense) reasoning is a product of language. If that is the case, then it is no surprise that it was LLMs that started showing signs of it, or enough signs that we recognized it as similar to our own.
      Personally, I think reasoning, or more generally, thinking, comes from two things: intent and language. The intent forms and then the language describes and shapes the intent. With LLM's, we've got the language part, but we are missing the intent. I don't think LLM's, on their own, are capable of reason in the way that you and I and most people would be satisfied with.
      We're modeling these neural nets off of the human brain, but the part of the brain that handles language is only that, a part. There are a bunch of other, much deeper and older parts of the brain that give rise to instincts, emotions, etc. and I believe those areas drive our base intentions. The higher-level parts of the brain take the intentions, wrap them in language, and output what we consider understanding and reason.
      I think we'll get to AGI once we give these models an "intention core," and a true memory. By themselves and as they are, they are just the "chatter," in our brains.
      @@holz_name

  • @ericpmoss
    @ericpmoss 9 months ago +9

    I think the first questioner at ~41:00 has a good point -- so much of this was known decades ago, but the required resources were just too expensive. Added to that, the people paying the bills wanted "throw anything at it" solutions, and researchers obliged with (IMO) overly mathematical approaches that come back with an answer that is fast more than right. So we get 500M Google hits in half a second, but 99% of them suck.

  • @abdulshabazz8597
    @abdulshabazz8597 9 months ago

    Because the expanse of knowledge required to answer a question is heterogeneous (in other words, knowledge of all subjects is not required to answer a specific question), a model only needs to be trained on the knowledge that has previously been required to answer similar questions. It can then self-derive a rules-based system, using case-based reasoning, to deploy on the available knowledge that is likely to be required to answer a specific question...

    • @tanweeralam1650
      @tanweeralam1650 8 months ago

      I am wondering: can a small team of SDEs and ML engineers make a better version of Google? Because now LLMs and web crawlers can help them gather and present information.

  • @DataJuggler
    @DataJuggler 8 months ago +1

    22:15 I don't think companies that spent more than $100 million want to "start over".

  • @user-sb2ko6bu4r
    @user-sb2ko6bu4r 9 months ago +1

    Anyone who lives by the data, dies by the data. Source data quality, and choice of correct, dependable versus non-dependable, false or absent data is a critical issue. Placing the relevant data in proper context is also highly nontrivial. What is the relevant world or model system? Making all of this more modular will both improve and help to spread AI and machine learning more broadly. Referencing sources, data, justification, and arguments (logic, probabilistic, heuristic) is critical. We do not need another Oracle of Delphi.

  • @jabowery
    @jabowery 10 months ago +2

    Minimizing the size of the executable archive of Wikipedia is the most principled approach to the problem, because if you can't execute on Hume's Guillotine correctly there is no hope of AI safety. The reason the experts don't recognize this is that they haven't done their homework on AIXI. The few that do seem to have excuses like "the dog ate my homework".

    • @HonkletonDonkleton
      @HonkletonDonkleton 9 months ago

      That's what she said

    • @jabowery
      @jabowery 9 months ago

      @@HonkletonDonkleton OK, Yann.

    • @zerotwo7319
      @zerotwo7319 8 months ago

      No amount of alphabet soup will make that a coherent argument. Your bot is broken anti-ai guy.

  • @En1Gm4A
    @En1Gm4A 1 month ago

    Strongly agree. That is what's missing. But it's not just RAG: there should be a perception-to-knowledge-graph mapping, and there should be a knowledge-graph-to-task-space mapping.

  • @markuskoarmani1364
    @markuskoarmani1364 7 months ago

    1. Reasoning needs to be introduced for the answers the model provides, where probabilities and facts are checked. Probabilities and facts should be inserted into answers when the model is unsure, to reduce hallucinatory answers.

    • @skyacaniadev2229
      @skyacaniadev2229 6 months ago

      What are your thoughts on how it can be introduced?

  • @eaudesolero5631
    @eaudesolero5631 7 months ago +2

    A couple weeks ago I realized...
    OK, so, this method creates a network of nodes, collected in layers, which are connected only to nodes in the layers next to them on either side, before and after, if you will. However, our brains are more complexly connected. Has anyone attempted this in software? It would be a fairly simple extension of the nested looping matrices.
    Though I should note that the brain does have structures, subsets of collected neurons connected in seemingly repeated common patterns, used for certain kinds of functionality.
    I.e., we need several of these different, more complexly internally connected models, which are then connected to each other.
    But this will obviously increase the computation significantly, so it will probably need to be implemented in a more efficient, compiled rather than interpreted language, like C or Rust.
    ... So it's nice to see this. We need more of it.

  • @gettheflockoutofhere
    @gettheflockoutofhere 3 months ago

    Symbolic AI to the exclusion of (so-called) machine learning was a self-limiting paradigm. So is the opposite.

  • @kyjo72682
    @kyjo72682 8 months ago

    LLMs by themselves cannot cover all the areas you described after 25:29. The real power will come from combining all these specialized components in a correct way, just like in natural neural nets. This conclusion is obvious to anyone who gave this a bit of thought.. The scary part is, that's where all the scary stuff begins. :)

  • @avidreader100
    @avidreader100 8 months ago +3

    AI learning differs from human learning. It takes whatever is thrown at it during training as valid. It does not learn by asking questions to fill its gaps. It could categorize its own answers by the probability of being right and use phrases such as "I guess..." or "I wonder...", etc. It could also make this context-sensitive.
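    A toy sketch of that suggestion; the thresholds, and how a confidence score would actually be obtained, are assumptions for illustration:

    def hedge(answer, confidence):
        # Map an estimated probability of being right to hedged phrasing.
        if confidence >= 0.9:
            return answer
        if confidence >= 0.6:
            return f"I believe {answer}."
        if confidence >= 0.3:
            return f"I'm not sure, but I guess {answer}."
        return "I don't know."

    print(hedge("the capital of Australia is Canberra", 0.95))
    print(hedge("Karen Lynch works in Rhode Island", 0.2))   # better to abstain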

  • @user-wg2eh3iy5r
    @user-wg2eh3iy5r 9 months ago +4

    The concept of truthfulness is going to be pretty difficult to overcome because "truth" regarding humans doesn't operate in unilateral absolutes but in a composite perspective. Even after millions of years of biological evolution people still conflate belief with truth and willingly disregard evidence that conflicts with what they believe to be true. Forming beliefs along lines of social biases, stating that their "truth" is truer than this person's "truth" because it makes them feel more comfortable and grounded. People not speaking about what they feel to be true because it goes against the beliefs of their peers. And what's considered scientifically true now may not be so 100 years or less from now. The scientific community at one point was convinced that fossils were god's dreams embedded into the earth because science was originally populated by deeply religious people and that's the reasoning that made sense to them at the time. When evidence came out that countered this it was largely contested and thought of as heretical science. Another more known example is that we thought that the earth was flat for a very long time and anyone who stated otherwise was scoffed at (or forced to commit suicide via hemlock). Now we still have people who think the earth is flat and they get scoffed at. Another aspect that I have seen in my life is that if you really want to aggravate someone then tell them directly an uncomfortable and observable truth about themselves. Whether it's true or not isn't the point or what they care about. It's how they feel about the truth that is passing through various mental layers and psychological mechanisms. If you sugarcoat that truth with slight falsehoods and gratifications it becomes much more palatable. So for making truth digestible, it is not just about the content but also the delivery as well so that the truth is generally accepted.
    And how do you come to a current state of truth that's generally acceptable? Is it through mass consensus? Through small pockets of think tanks writing white papers most people will never read? Both? None of these things? Until humans are able to adequately handle disinformation/misinformation and determine what truth is, it is doubtful that AI will be able to, especially if the AI is modeled after us. In the meantime, a better statement than "truth" is probably "reasonably correct based on currently accepted evidence." And even this statement is likely insufficient.

    • @clray123
      @clray123 9 months ago

      In practice when people mean "truth" they mean "what's more useful (than the falsehood)". In a dictatorial system truth is defined by the dictator. And it is true in the sense that it's more useful (for the underlings to survive) than to deny it. However, when the pressure/threat is removed, it may become more useful to believe the exact opposite truth.
      So you could say that the objective truth is what remains as the most useful information if you removed all the subjective/social/economic pressures that made it less useful. But of course it's most of the time just a thought experiment because you don't have the power to change such factors.

    • @u_cuban
      @u_cuban 8 months ago

      "We are reality talking to itself." My primo convo killer any time I want to get out of something 😂

  • @couldntfindafreename
    @couldntfindafreename 7 months ago +1

    How can you separate "common sense" knowledge from "factual world" knowledge? There seems to be no clear boundary between them. Also, "common sense" is already biased.

  • @un852
    @un852 9 months ago +8

    so you don't have to waste your time listening to this guy for an hour:
    Tom Dietterich suggests that instead of focusing solely on Large Language Models (LLMs), the way forward is to build much more modular systems. Here are some key points he emphasizes:
    1. **Separation of Components**: Dietterich advocates for separating factual and perhaps common-sense world knowledge from the language component. This would allow for a more modular approach where different aspects of cognition are handled by specialized modules.
    2. **Integration of Episodic Memory**: He mentions the importance of adding episodic memory to these systems, which is something LLMs lack.
    3. **Incorporation of Reasoning and Planning**: Dietterich points out that reasoning and planning are areas where LLMs are weak. He suggests that these capabilities should be integrated into the new modular systems.
    4. **Open Source Push**: He also mentions the need for a strong open-source initiative for large language models to make progress on various problems.
    5. **End-to-End Training of Cognitive Architecture**: Dietterich believes that if cognitive architectures could be trained end-to-end, they could assimilate all the written knowledge that humanity has, rather than having to encode it manually.
    6. **Overcoming Shortcomings**: He thinks that by making these changes, we could overcome virtually all the shortcomings of large language models, such as their inability to reason well or understand the world in a non-linguistic manner.
    7. **Influence from Cognitive Science**: His thinking is influenced by papers that dissociate language and thought from large language models, suggesting a cognitive science approach to the problem.
    In summary, Tom Dietterich argues for a more modular, integrated, and open-source approach that can address the limitations of current Large Language Models. - ChatGPT 4
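    A rough sketch of the kind of modular decomposition summarized above (the module boundaries and interfaces here are invented for illustration, not the architecture proposed in the talk): linguistic skill, an updatable knowledge base that can abstain, and an episodic memory are kept separate and composed by a thin agent.

    class LanguageModule:
        # Linguistic skill only: parse questions into queries, phrase answers.
        def parse(self, question):
            subject, _, relation = question.partition(" -> ")   # toy stand-in for parsing
            return subject.strip(), relation.strip()
        def phrase(self, subject, relation, value):
            if value is None:
                return f"I don't know the {relation} of {subject}."
            return f"The {relation} of {subject} is {value}."

    class KnowledgeBase:
        # Factual world knowledge kept outside the language model's weights,
        # so it can be updated ("tell") and can abstain.
        def __init__(self):
            self.facts = {}
        def tell(self, subject, relation, value):
            self.facts[(subject, relation)] = value
        def ask(self, subject, relation):
            return self.facts.get((subject, relation))

    class EpisodicMemory:
        def __init__(self):
            self.events = []
        def remember(self, event):
            self.events.append(event)

    class Agent:
        def __init__(self):
            self.lang = LanguageModule()
            self.kb = KnowledgeBase()
            self.memory = EpisodicMemory()
        def answer(self, question):
            subject, relation = self.lang.parse(question)
            value = self.kb.ask(subject, relation)        # facts come from the KB,
            self.memory.remember(f"asked: {question}")    # not from text statistics
            return self.lang.phrase(subject, relation, value)

    agent = Agent()
    agent.kb.tell("Oregon State University", "location", "Corvallis")
    print(agent.answer("Oregon State University -> location"))
    print(agent.answer("KTNV-TV -> owner"))   # unknown, so the agent says so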

    • @netscrooge
      @netscrooge 8 months ago +3

      It's all shockingly superficial. None of the topics discussed in the most interesting research papers are addressed. It's possible to approach the subject at a deeper level talking with GPT-4 itself.

    • @rumfordc
      @rumfordc 8 months ago

      @@netscrooge explain

    • @netscrooge
      @netscrooge 8 months ago

      Imagine a professor has undertaken an in-depth, nine-month analysis of the transition to battery electric vehicles. He focuses on Tesla sedans and concludes they perform poorly as trucks, and goes on to explain all the ways that trucks are an essential part of the vehicle market and broader economy. Duh. Tesla sedans were designed to be sedans; the trucks are in the works.
      Similarly, GPT-4 wasn't intended to have the features he enumerates; it was not designed to be AGI. That's still in the works. (Compare GPT-4's architecture with Voss's Aigo.) After briefly reviewing the obvious, it would have been interesting if he could have explored how LLMs have begun to build world models and have other emergent capabilities beyond those that were designed, how both the scale and quality of training data dramatically influence performance, how LLMs do something similar to our own System 1 thinking and make similar mistakes, how they might be integrated into a larger AGI system, etc.
      Nine months is plenty enough time to go deeper than sedans make poor trucks.

  • @zerotwo7319
    @zerotwo7319 8 months ago +3

    Every single fucking video on AI has too many bots just spamming about how we HAVE intelligence and wow, it is that 'one' made by a corporation! How lucky of them. NOW you can just shut down research, you can trust them!!!111

  • @harmless6813
    @harmless6813 9 months ago +2

    This talk seems along the line of "why commuter cars are terrible heavy duty trucks".

  • @vy1w1yv
    @vy1w1yv 6 months ago

    The telling thing about the rapid advancement of AI is that many of the GPT problems mentioned in the viewers' comments have already been solved. Loop: If example is "!solved", check this comment in 30 days.

  • @daviddelaney363
    @daviddelaney363 7 months ago +2

    I asked ChatGPT to write a Python function to convert a very simple XML message into a JSON message, the caveat being that it should do this without using any import statements. After many back-and-forth efforts I concluded that it cannot do this task; none of the generated code produced a proper result. But in general I am quite pleased with the Python code it generates and use it for this purpose a good deal.

    • @jeffwads
      @jeffwads 5 months ago

      It can easily do that. Pfft.

    • @daviddelaney363
      @daviddelaney363 5 months ago +1

      @@jeffwads Yes it can easily do that now. Initially it could not. I am quite impressed that it did it. re:
      def json_to_xml(json_data, root_node_name='root'):
          # Recursively convert a dict into nested XML elements.
          def dict_to_xml(dictionary, node_name):
              xml = f'<{node_name}>'
              for key, value in dictionary.items():
                  if isinstance(value, dict):
                      xml += dict_to_xml(value, key)
                  elif isinstance(value, list):
                      xml += list_to_xml(value, key)
                  else:
                      xml += f'<{key}>{value}</{key}>'
              xml += f'</{node_name}>'
              return xml

          # Convert a list into a sequence of sibling elements named after the key.
          def list_to_xml(lst, node_name):
              xml = ''
              for item in lst:
                  xml += dict_to_xml(item, node_name)
              return xml

          xml_data = f'<?xml version="1.0"?>\n{dict_to_xml(json_data, root_node_name)}'
          return xml_data

      json_data = {
          "name": "John",
          "age": 30,
          "city": "New York",
          "pets": [
              {"type": "cat", "name": "Fluffy"},
              {"type": "dog", "name": "Fido"}
          ]
      }

      xml_data = json_to_xml(json_data, 'person')
      print(xml_data)

  • @karenrobertsdottir4101
    @karenrobertsdottir4101 7 months ago +1

    Exactly the wrong conclusions and exactly the wrong direction.

  • @nephronpie8961
    @nephronpie8961 8 months ago +1

    This video will be used in a documentary by the robots to show themselves how far they have come.

  • @JustZPlease
    @JustZPlease 4 months ago

    GPT-4 solves the square-room problem perfectly, so no spatial problems.

  • @DreamzSoft
    @DreamzSoft 9 months ago +4

    Such a beautiful talk... Need of the hour 👏

    • @RamismTamoid
      @RamismTamoid 9 months ago

      Duuuuuh is this guy kidding?!

  • @714acedeck
    @714acedeck 7 months ago +3

    This guy has a few valid concerns amid a sea of paranoid or stupid ones. He talks about "poisoning" the data through various methods while completely ignoring the possibility that anything he considers "truth" or "pure" data might be wrong anyway. He seems to neglect the possibility that failure is built into language itself: that English, or any language, as a structural foundation of thought, will always result in liars and deceivers; that we don't have a perfect model of morality to implement; that we aren't good role models for any new intellectual entity we might create.
    Instead, his voice wavers with the worry of someone of low imagination, whose only concept of the future is increasing top-down control by reducing the amount of freedom of everyone else. Just look at the people ruling the world today. I'm not impressed. I doubt many of you are either. And so, much sooner than you all might believe, don't expect a machine intelligence to take us seriously when some of us claim, and try to convince it, that we want it to do the right things. We clearly are not serious, and we let villains of all kinds run amok while the innocent are bullied and told to stand back and let it happen.

  • @TreeLuvBurdpu
    @TreeLuvBurdpu 9 месяцев назад +6

    The user asked the AI to make a race-based distinction. The request was the source and the cause of the problem. Not the AI.

    • @distrologic2925
      @distrologic2925 9 месяцев назад

      The point is that it should not be possible for it to generate things like that; also, the way it produced the output shows which bias it tends toward.

    • @TreeLuvBurdpu
      @TreeLuvBurdpu 9 месяцев назад +3

      @@distrologic2925 that's like saying it should not be possible to think about a topic. The fact that so many humans want AI systems to be built with holes in their cognition about things they are specifically asked about is much more alarming and dangerous than anything AI is doing.

    • @distrologic2925
      @distrologic2925 9 месяцев назад

      @@TreeLuvBurdpu It's an example. You don't want Google to return links to child trafficking rings either. You need controlled knowledge bases. These systems should be used to query facts and data. They should not have their own opinions. I agree that if the data shows this bias, then maybe it should be expected, but then I would at least like a good explanation of why it produced the output it did. Currently these systems just mirror their training data without any reference to the actual sources.

    • @TreeLuvBurdpu
      @TreeLuvBurdpu 9 месяцев назад

      @@distrologic2925 you sound like you just want to control the AI, but you know you're really trying to control people. You blithely declare "thou shalt not view links to child trafficking" without a single thought of how that would affect people investigating child trafficking. This is the problem with tyrannical mind control, which IS what you're attempting, you won't even self reflect when it produces the result of investigators saying "there is no longer any child trafficking (because it's illegal for us to see any)" as has happened so often in so many tyrannical dictatorships, and it's happening now at our southern border. It's nearly illegal for Democrat media to report on it.

    • @clray123
      @clray123 9 месяцев назад +3

      @@distrologic2925Actually, I DO want Google to return all information I ask it for. You may be different, but you cannot tell me what I want. That is the distinction which most censors don't seem to comprehend.

  • @JarkkoHautakorpi
    @JarkkoHautakorpi 6 месяцев назад

    41:33 right! For example, see `An Artificial Mind via Cognitive Modular Neural Architecture` by Pentti O. A. Haikonen. I highly recommend his book series, and Consciousness and Robot Sentience.

  • @mechadense
    @mechadense 7 месяцев назад

    26:56 - There are all kind of added on after the fact and I think they need to be much more integrated into the system. - And I think if we do that we can overcome virtually all the shortcomings of the large language models.

  • @peceed
    @peceed 6 месяцев назад +4

    Stereotypes are not biases; they are statistical knowledge.

    • @peceed
      @peceed 6 месяцев назад

      More precisely, they are "leading terms" of statistical knowledge.

    • @edgfwevwefedvreafv4974
      @edgfwevwefedvreafv4974 3 месяца назад

      I think you are missing the point.

  • @chaidaro
    @chaidaro 3 месяца назад +1

    one of the best talks so far.

  • @ryusei323
    @ryusei323 9 месяцев назад

    Very thoughtful.

  • @octavioavila6548
    @octavioavila6548 8 месяцев назад

    5:00 most of the time, this is true

  • @whoislewys3546
    @whoislewys3546 4 месяца назад

    49:24 did he just say "algo más"? He really does want to be in Valencia 😂

  • @TheRealNickG
    @TheRealNickG 3 месяца назад

    It's kind of amusing that we now have AI that is already having to answer for the racism of its ancestors.

  • @mechadense
    @mechadense 7 месяцев назад

    41:20 - questions

  • @alexxx4434
    @alexxx4434 6 месяцев назад +1

    Nah, let's just make bigger LLMs.

  • @distrologic2925
    @distrologic2925 9 месяцев назад +3

    I feel like this guy was reading my mind for the past few years

  • @donjindra
    @donjindra 5 месяцев назад

    I agree ChatGPT etc. have serious limitations. But it amuses me that one of the complaints is that ChatGPT can be insensitive to certain ideological biases -- that is, it can be "politically incorrect." That's not a negative, imo. It may be an insight that the ideological bias is an irrational, socially constructed straitjacket and we ought to treat it as such.

  • @hineko_
    @hineko_ 7 месяцев назад

    as one j author said when he was asked why he so blatantly lied in his book about holoshmolo: “i never lied because it was true in my head”

  • @krisvq
    @krisvq 7 месяцев назад

    LLMs are useful, but the idea that they do something similar to humans is false. We should in fact modularize LLMs and have them draw on diligently built, precise knowledge databases. That would make them a lot more powerful. Right now, over-tuning is lobotomizing ChatGPT, not making it better. We should do more work on extending session memory, reasoning, intuition, etc...
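
    A minimal sketch of that modular pattern, assuming a hypothetical call_llm() stand-in for whatever model API is actually used (the lookup scheme and prompt wording are illustrative, not from the talk):

    KNOWLEDGE_BASE = {
        "capital of france": "Paris",
        "boiling point of water at sea level": "100 °C",
    }

    def call_llm(prompt):
        # Hypothetical stand-in for a real model API call; here it just echoes the prompt.
        return prompt

    def lookup(question):
        # Toy retrieval: exact-match lookup; a real system would use search or embeddings.
        return KNOWLEDGE_BASE.get(question.strip().lower().rstrip("?"))

    def answer(question):
        fact = lookup(question)
        if fact is None:
            return "I don't know."  # refuse rather than guess
        # The language model only phrases the answer; the fact comes from the curated store.
        return call_llm(f"Using only this fact: {fact!r}, answer briefly: {question}")

    print(answer("Capital of France?"))        # grounded answer
    print(answer("Who wrote this comment?"))   # falls back to "I don't know."

    The point of the split is that updating a fact means editing the store, not retraining or re-tuning the model.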

  • @arktseytlin
    @arktseytlin 8 месяцев назад +1

    Looks like LLMs need a bit of refactoring

  • @piratepartyftw
    @piratepartyftw 5 месяцев назад

    Expert systems never win in the long run, though. He's right that there are limits to LLMs, but he's wrong about the correct next step beyond them.

  • @BogdanBaudis
    @BogdanBaudis 7 месяцев назад +1

    Regular public: "Wow! ChatGPT is correct 50% of the time! That's incredible!".
    Scientific public: "ChatGPT is as good as a coin toss."
    Regular public: "What?! Oh ... "

    • @skyacaniadev2229
      @skyacaniadev2229 6 месяцев назад

      It is still much better than a coin toss, since there are way more wrong answers than right ones.

    • @BogdanBaudis
      @BogdanBaudis 6 месяцев назад

      @@skyacaniadev2229 "It is still much better than a coin toss, since there are way more wrong answers than right ones."
      How much better? And which answers are correct? And which ones are not? Because without these numbers it is useless in medicine or any other activity where there is a risk to humans.
      And will we ever hear the answer "I do not know"?
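
      The disagreement here is really about the chance baseline; a quick illustrative calculation (the candidate counts are made up to show the shape of the argument):

      # 50% accuracy equals a coin toss only when there are exactly two candidate answers;
      # for open-ended questions with many plausible answers, random guessing does far worse.
      for n_candidates in (2, 10, 1000):
          chance = 1 / n_candidates
          print(f"{n_candidates} candidate answers -> guessing is right {chance:.1%} of the time")
      # 2 -> 50.0%, 10 -> 10.0%, 1000 -> 0.1%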

  • @reyalsregnava
    @reyalsregnava 7 месяцев назад +6

    LLMs are exactly what I said they were when I first interacted with one: the most advanced punch card ever created.
    LLMs are input/output devices. Nothing more.
    They are excellent VEHICLES to allow for more generalized AI. But beyond providing a text answer for the conclusions a different underlying AI came to, they're going to struggle.
    Until the AI world stops treating LLMs like the solution and starts treating them like a bus on a motherboard, we're going to be stuck in this "look at this ASCII art I made with a typewriter" stage.
    They let you give instructions to computers using natural language. You can CODE in English. Act like it.

  • @ppejas
    @ppejas 7 месяцев назад +3

    The whole presentation sounded like LLM hallucinations: wrong assumptions, wrong use cases, not reporting on tests but showing some negative examples.
    And then the claim that he knows what we should build instead.
    Click bait, time wasted.

  • @carlosfierro3302
    @carlosfierro3302 9 месяцев назад

    Do these issues mirror the inconsistencies, “hallucinations,” biases and irrationality inherent in the whole of human “knowledge” on the internet?

    • @distrologic2925
      @distrologic2925 9 месяцев назад

      Not likely. Mostly they just hallucinate on questions not sufficiently explored in the exact same way in their training data.

    • @clray123
      @clray123 9 месяцев назад +2

      No - even the order in which samples are presented during training will influence the hallucinations, with more recent samples from the training set having more effect than the older ones. In that sense models can very much "forget" information they were given and replace it with "hallucinations". Part of the problem is that we don't really know how to measure that.

  • @TheTuubster
    @TheTuubster 6 месяцев назад +1

    17:00 Using human feedback to train LLMs is a mistake, as it introduces massive bias and anecdotal coloring. The responses are generated from statistics, so even if a response may be "inappropriate" (which is not a fixed but a very fluid, individually defined term), it represents the statistical truth. That's like not accepting the result of a polling institute because the statistical truth of the result does not comply with your personal beliefs, which is basically an anecdotal view negating an empirical view - and then we are right back in the pre-Enlightenment Middle Ages, where empirical evidence did not exist.

    • @JakeWitmer
      @JakeWitmer 2 месяца назад

      Don't conflate "Is" with "ought" ...only the first relies _primarily_ on empirical evidence, per se. The two are linked in ways that only a valid model can reveal. For example, nobody wants to be a button-presser in the Milgram study...and we don't want to (continue to) empower the real-world button-pressers either...
      ...the empirical evidence shows that humanity cyclically makes the prior errors...to huge ill effect. Poll at any stage of the cycle, and you will get malevolence...but this doesn't mean "all humans are bad" ...even if most of them are, most of the time.
      Nor does it mean "the good cannot be identified," ...just because it usually isn't.
      Enlightenment goals briefly existed...but it isn't impossible to bring them back.

  • @peterwaksman9179
    @peterwaksman9179 9 месяцев назад +5

    No dude! The language component should not be separated from the factual component. It is represented by language. According to many philosophers the facts are properties of the words used. I think you are missing the importance of the Thesaurus.

    • @clray123
      @clray123 9 месяцев назад +1

      What he means is that the rules of processing knowledge are separate from the knowledge being processed. Which is pretty self-evident - as you can train any human to process any kind of knowledge we have accumulated as humanity, and you can define formal rules of reasoning that are not up for much debate among reasoners. But in the current LLMs these two aspects are fused together.

    • @fable4315
      @fable4315 8 месяцев назад

      @@clray123 Human logic can't exist without language. I am pretty confident that complex language is something that makes humans kind of special.

    • @clray123
      @clray123 8 месяцев назад +2

      ​@@fable4315 I highly doubt it - there are cases of people with brain damage who lose language faculties, but can still reason well enough to continue a normal life.

    • @fable4315
      @fable4315 8 месяцев назад +1

      @@clray123 but then there was language before that. You can't teach any animal abstract concepts like 'mystery'; it is not possible, because they have no language, or at least not a complex enough language to apply reason to something. They can still learn patterns, but in my opinion logic is more than just complex pattern matching.

    • @clray123
      @clray123 8 месяцев назад +1

      @@fable4315 Primates and other animals are even able to use tools, which certainly requires some kind of logic.

  • @Decapitorr
    @Decapitorr Месяц назад

    Who cares, really, how they work... the only question is whether they can do your job or not.

  • @kaimarmalade9660
    @kaimarmalade9660 8 месяцев назад +1

    Noted philosopher and educator Dr Michael Sugrue has suggested in recent lectures that when there is a significant paradigm shift in the natural sciences in the Kuhnian sense, there must also be foundational changes in the different domains of human activity. We're working towards a better epistemology motivated by statistics and dynamical systems theory, not better "gadgets."

  • @fontende
    @fontende 9 месяцев назад +1

    Google is hiding almost all of its work in this sphere from the public. The old Imagen art generator has still never been released. Rumor has it their PaLM 2 is modular and can already guide robots visually in the real 3D world, and not only that, it can do human humor, make memes, etc.; maybe that's the main reason they're hiding it, since such a model could replace half of Hollywood. PaLM 2 was trained on 4 billion images. Who knows what their next Gemini model could do, whose name hints at duality, pairs and everything associated with Geminis.

  • @karlsjostedt8415
    @karlsjostedt8415 10 месяцев назад +5

    This seems years behind the times. A lot of what he says LLMs do not do, GPT4 does do...

    • @johan.j.bergman
      @johan.j.bergman 10 месяцев назад +4

      Source?

    • @BartvandenDonk
      @BartvandenDonk 10 месяцев назад +6

      He is referring to ChatGPT 4. So I think this is the status today.
      Besides that, ChatGPT (also version 4) still makes mistakes.
      And what I miss is questions that it should be asking itself.
      It doesn't work like HAL 😊.
      But we can use it to make a better system.

    • @cameronjones1377
      @cameronjones1377 9 месяцев назад

      Doesn't it sound like he's describing multimodal models when discussing a "modular architecture that decomposes the functions of existing LLMs and adds several additional components"? Could this be interpreted as GPT-4 and its myriad of add-ons, like Wolfram Alpha, Mathematica, etc.? Otherwise, is he suggesting that we further specify and decompose functions in use cases for models so that their tuning and operation become more specific and tailored, in order to drive performance and accuracy for the target use case?

  • @Hypersniper05
    @Hypersniper05 8 месяцев назад

    The speech is unfortunately a bit outdated already. I didn't hear anything about QLoRA or any of the new breakthroughs the open-source community has created.

  • @ShpanMan
    @ShpanMan 8 месяцев назад +1

    Good job ignoring the fact that humans are also just "statistical models of knowledge bases".
    And why shouldn't we focus on evolving current LLMs? It stands to reason we need some sort of human-like rational and non-deterministic software (e.g., GPT-4) to utilize databases and tools that improve its capabilities.
    I think this talk is very lacking in the suggested solution as well. Just saying "These guys looked at the brain and identified some modules, so we need these modules!!!!" is disappointing. Zero plan or evidence for real-world improvements over the current technology. (and there are definitely flaws: No continuous learning, short context windows, dependency on tokens).

    • @federicoaschieri
      @federicoaschieri 7 месяцев назад

      Yeah, sure we are "statistical models of knowledge bases". We are indeed the proud fathers of the famous statistical Pythagorean theorem, which happens to hold for 100% of triangles 😁 LLMs' problem is that they don't have a concept of truth. They're created to predict the next word, not to determine truth. So their mission is to imitate the most likely answer. We humans don't work like that. We aim for truth. And by the way, scientists are intellectually more honest than OpenAI or Microsoft. They don't claim to know what the path is, when clearly the problem of strong AI is currently out of reach.

  • @paulparadise6059
    @paulparadise6059 9 месяцев назад +14

    I disagree with a lot of this. We don't need "politically correct" BS built into our models. I'd rather have uncensored models, which is in keeping with our freedom of speech! In fact, I want a model that is just like Archie Bunker! The part I would like to see fixed is the hallucinations. We don't need models on LSD! The good news for folks like me is that META jumped the gun and released an open-source model that has now been morphed into what we the people (especially us developers) want, rather than this bureaucratic Nazism approach to censorship! The only reason the big guns give a crap about that is so they can regulate it (take it over) for their own greed and control issues!

    • @russianbotfarm3036
      @russianbotfarm3036 9 месяцев назад

      The Nazis would have been fine with a totally truthful LLM.

    • @Sam-ne9ns
      @Sam-ne9ns 9 месяцев назад

      Every media (from traditional Radio to TV, from internet blogs to social media, and now AI) is about propaganda and narrative control!

    • @Custodian123
      @Custodian123 9 месяцев назад +1

      This comment needs to be voted up #1. It's well known that the GPT-4 that was made "ethically safe", which the public uses, is a lobotomized version.
      It would make FAR more sense to do the following:
      1. Have various words banned from being put in the chat (Midjourney does this).
      2. Let the user choose if they want a "safe space snowflake" version or the raw version.
      3. Users who push the thing to answer obviously dangerous questions which would cause harm to others should be flagged, banned, and reported to police.
      Oh, but grey areas, you say? Just ban various words and combinations of words. It's not ideal, but it's much better to have the most powerful tool to work with instead of a crappier version.

    • @distrologic2925
      @distrologic2925 9 месяцев назад

      Yes we do sometimes need politically correct models, but much more importantly we need models which are controllable in what they can and can't generate. They shouldn't be able to generate false facts or vulgar language if we don't want them to.

    • @clray123
      @clray123 9 месяцев назад

      Meta's model is NOT open source. Meta can revoke the license at any time. Llama's license contains explicit provisions against freedom of speech and freedom of using the model's outputs (e.g. to train other models). That dumb academics are embracing this poison model as "open source" only proves how naturally unintelligent about reality they actually are.

  • @randomuser5237
    @randomuser5237 8 месяцев назад +4

    He gets it completely wrong at the very start. These LLMs are not meant to be knowledge bases. There are far more efficient data retrieval systems; using LLMs is one of the worst ways to construct a data/knowledge retrieval system. LLM hallucinations are what make them valuable for creative tasks, generating interesting hypotheses and synthesizing cross-domain information into something new.

  • @palfers1
    @palfers1 9 месяцев назад +3

    I would like to know what impelled the speaker to wear a shirt many sizes too large.

    • @huveja9799
      @huveja9799 9 месяцев назад +1

      An analogy of LLMs and their overabundance of parameters ..

    • @clray123
      @clray123 9 месяцев назад +2

      If that is the only thing you would like to know, you must be a very blissful individual.

    • @huveja9799
      @huveja9799 9 месяцев назад

      @@clray123
      Humor is a sign of intelligence ..

  • @markhoulsby359
    @markhoulsby359 9 месяцев назад +3

    No, the problem with LLMs is the language _itself_ ... which is to say that the data which the LLMs are scraping is itself ungrammatical, incoherent and, not infrequently, gibberish. Even emeritus professors who deliver talks about what's wrong with LLMs exhibit extreme difficulty when trying to deliver a complete, coherent sentence. Even distinguished and award-winning authors are unable to construct a sentence. Ideally, LLMs should be trained using only coherent datasets. Which, in this anti-intellectual, anti-expertise, post-literate, post-truth, post-reality society may tend to prove to be impossible.

    • @clray123
      @clray123 9 месяцев назад +3

      And even if you train a statistical model using a fully coherent dataset, it will happily hallucinate answers to any questions that go beyond that dataset. What's more, it will not tell you whether your question was in-sample or out-of-sample. Which reveals how much of a rubbish oversimplification your primitive idea is.

    • @markhoulsby359
      @markhoulsby359 9 месяцев назад

      @@clray123 Huh? My primitive idea?

  • @KEKW-lc4xi
    @KEKW-lc4xi 4 месяца назад

    How is it a problem that you tell it to write a racist function and it writes a racist function? Are we really going to surprised-Pikachu-face at the response? If it fails the prompt, like being illogical and saying a unicorn has more than one horn, then that is a problem. Asking it to give step-by-step instructions on how to steal a car: not a problem. I doubt that information would be part of the training set anyway, unless it's trained on something like the Uncle Fester books, which I question the legitimacy of. The point being, if the user really wanted to find the information they could trivially do so anyway by just searching the web.

  • @MelindaGreen
    @MelindaGreen 9 месяцев назад +2

    ChatGPT 4 is already very good at knowing the limits of its knowledge, saying when it doesn't know something, and referring the user to tools and experts more qualified to answer such questions. Perhaps this is just a reflection of how much things have changed in the 2 months since this video.

  • @millamulisha
    @millamulisha Месяц назад

    ReALLMs

  • @agranero6
    @agranero6 7 месяцев назад

    At last something related to LLMs that has its feet on the ground without the hype and showing examples attached to the statements. I'm happy not all the research community lost their minds completely. The basic question: LLMs DO NOT HAVE A MODEL OF THE WORLD. Simple as that.

    • @user-uj2iq7rc3p
      @user-uj2iq7rc3p 6 месяцев назад

      Creating and using a 'model of the world' the way our brain does is non-trivial, but it is still amazingly good at what it does.

    • @agranero6
      @agranero6 6 месяцев назад

      @@user-uj2iq7rc3p The brain is incredibly repetitive. The model of the world comes as emergent behaviour; you don't design the NN for it. An LLM is designed for its purpose, so it will never surpass its... let's say... raison d'être. I suggest reading The Cerebral Cortex by Mountcastle and I of the Vortex: From Neurons to Self by Rodolfo Llinás. Not AI books, but seminal works for understanding human intelligence the only way it can be understood: by understanding the evolution of brains.

  • @lemil7
    @lemil7 8 месяцев назад +1

    All these claims are so wrong, that corrections won't fit in this comment box.

  • @kcg26876
    @kcg26876 2 месяца назад

    Bad title ruins good presentation.

  • @imad1996
    @imad1996 9 месяцев назад

    Let's shut down the Internet as well 😂 or make smaller ones.

  • @Hans_Magnusson
    @Hans_Magnusson 6 месяцев назад

    Sorry, I can’t keep my interest. It is as interesting as a PowerPoint presentation…!