Eight Things to Know about Large Language Models

  • Published: Jun 4, 2024
  • Eight things experts know about LLMs (large language models) that you should know too.
    ► Sponsor: Prodigy 👉 prodi.gy/
    Check out our #MachineLearning Quiz Questions: / aicoffeebreak
    ➡️ AI Coffee Break Merch! 🛍️ aicoffeebreak.creator-spring....
    📜 Bowman, Samuel R. "Eight things to know about large language models." arXiv preprint arXiv:2304.00612 (2023). arxiv.org/abs/2304.00612
    📺 Debate On AGI: Existential or Non-existential? (Connor Leahy, Joseph Jacks) [MLST LIVE] ruclips.net/user/liveMw7BHDboARY
    📺 OpenAI suggests AI licenses (US Senate hearing on AI regulation w/ Sam Altman) • OpenAI suggests AI lic...
    ☕ ChatGPT vs. Sparrow: • ChatGPT vs Sparrow - B...
    ☕ Why does ChatGPT lie: • Why ChatGPT fails | La...
    ☕ Language models vs. meaning: • Can language models un...
    ☕ Can AI be moral? • Moral Self-Correction ...
    Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
    Dres. Trost GbR, Siltax, Kshitij, Edvard Grødem, Vignesh Valliappan, Mutual Information
    Outline:
    00:00 Eight facts about Large Language Models
    01:15 Prodigy (Sponsor)
    02:15 LLMs complete sentences (like ChatGPT)
    02:44 (1) LLMs get better with size
    04:06 (2) LLMs surprise with emergent abilities
    06:32 (3) LLMs seem to have "mental models"
    07:45 (4) You cannot reliably steer LLMs
    09:13 (5) Nobody knows why LLMs work
    10:46 (6) They can outperform humans
    11:35 (7) AI Alignment
    11:41 (8) LLMs make for a great first impression
    13:15 (9) They do not train for truth
    ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
    🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
    Patreon: / aicoffeebreak
    Ko-fi: ko-fi.com/aicoffeebreak
    ▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
    🔗 Links:
    AICoffeeBreakQuiz: / aicoffeebreak
    Twitter: / aicoffeebreak
    Reddit: / aicoffeebreak
    RUclips: / aicoffeebreak
    #AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
    Video editing: Nils Trost
  • Science

Comments • 82

  • @AICoffeeBreak
    @AICoffeeBreak  1 year ago +33

    We tried something new this time. What do you think about the style? :)

    • @kevalan1042
      @kevalan1042 1 year ago +7

      top notch! suggested by an LLM?

    • @flamboyanta4993
      @flamboyanta4993 1 year ago +6

      I really like it. The lighting is on point. Thanks Ms Coffee Bean!

    • @stephenselby4252
      @stephenselby4252 1 year ago +2

      It’s great. Thank you for showing your face and not hiding behind text-to-speech!

    • @AICoffeeBreak
      @AICoffeeBreak  1 year ago +3

      @@stephenselby4252 Oh. 🙈

    • @triforce42
      @triforce42 1 year ago +4

      @@AICoffeeBreak Not you nor anyone needs to show their face if they don't want, nor even use their own voice. I'm happy you're experimenting and this is one of many valid options, in my opinion. Whatever the style, I am with it for content alone.

  • @neolynxer
    @neolynxer 1 year ago +13

    There is a video of a student who was accused of using ChatGPT for his essay.
    The professor's justification was that he fed parts of the essay to ChatGPT and asked whether the text was written by ChatGPT, and got the answer "Yes, it was."
    The essay was actually written by the student himself. Even after the situation was resolved, the professor gave him a "C" because he was salty that he had been wrong.
    People need to be aware of what LLMs are and never, NEVER use them for anything serious. Especially never use them to self-diagnose or to ask how to behave in dangerous situations.

  • @DavidBeauregardLyle
    @DavidBeauregardLyle 1 year ago +13

    "Truth is (like other abilities) an emergent behavior, and not sticking to the truth is a silent failure of the model."
    Great information, and thank you for distilling so much about this technology! There is so much to learn and work on to address the challenges of epistemic erosion (as living legends Gary Marcus and Daniel Dennett recently argued on MLST).

  • @marklopez4354
    @marklopez4354 1 year ago +2

    "puppy training"...love it!🤣

  • @Mutual_Information
    @Mutual_Information 1 year ago +4

    Nice work! Regarding (4), you can't reliably steer LLMs... that's a big one that many are sweeping under the rug.

    • @Hexanitrobenzene
      @Hexanitrobenzene 11 months ago

      This one alone should be a deal breaker for the decision to deploy...

  • @lisatrost7486
    @lisatrost7486 1 year ago +3

    A new video for my birthday!!! Thanks😍😍🎉

  • @DerPylz
    @DerPylz 1 year ago +9

    I love these videos that explain more general aspects! Thank you!

  • @harumambaru
    @harumambaru 1 year ago +6

    OMG, number 9 is so important! Also, if we look at the optimization function of social networks, which optimize for time spent to increase ad revenue, it distorts content toward the more hateful (because that gets more engagement), and this is one of the best autocompletions that exists on the internet today. Now I am thinking that fiction and non-fiction books, in addition to trusted papers like The Economist or The New York Times, would make for such a good pre-trained model.

    • @mgostIH
      @mgostIH 1 year ago +4

      Beware: point 9 isn't that "LLMs don't know what's true", but that they don't strictly optimize towards it, although in many contexts they actually need to know what's real in order to minimize a predictive loss. This means that internally there are latent vectors of what you would define as truth, just like for other concepts such as love. It's possible to discover these vectors and heavily bias the output of an LLM towards them or in the opposite direction; I can point you to some work about this later if you are interested.

    • @scitechtalktv9742
      @scitechtalktv9742 1 year ago +2

      I would also like to know more about this (these latent vectors etc.)

    • @mgostIH
      @mgostIH 1 year ago

      @@scitechtalktv9742 Steering GPT-2-XL by adding an activation vector
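The activation-addition idea this thread points at can be caricatured in a few lines: take the difference between a model's hidden activations on a contrasting pair of prompts, then add that direction back in during generation. Below is a toy sketch in which plain vectors stand in for real transformer activations; every name here is hypothetical, and the actual "Steering GPT-2-XL" work hooks into a real model's residual stream rather than this bag-of-letters stand-in.

```python
import numpy as np

# Toy stand-in for a transformer layer: a fixed random projection of a
# bag-of-letters encoding. Purely illustrative -- real activation steering
# modifies the hidden states of an actual model.
rng = np.random.default_rng(0)
PROJ = rng.normal(size=(26, 8))

def hidden_state(text: str) -> np.ndarray:
    """Map text to a fake 8-d 'activation' (letter counts, projected)."""
    counts = np.zeros(26)
    for ch in text.lower():
        if ch.isascii() and ch.isalpha():
            counts[ord(ch) - ord("a")] += 1
    return counts @ PROJ

def steering_vector(positive: str, negative: str) -> np.ndarray:
    """Direction from the negative prompt's activation to the positive one's."""
    return hidden_state(positive) - hidden_state(negative)

def steer(activation: np.ndarray, direction: np.ndarray, strength: float) -> np.ndarray:
    """Add the scaled direction, biasing whatever is computed downstream."""
    return activation + strength * direction

love_minus_hate = steering_vector("I love you", "I hate you")
h = hidden_state("the weather today")
h_steered = steer(h, love_minus_hate, strength=2.0)
```

In the real technique, the contrast pair is run through the model once, the activation difference is cached at a chosen layer, and it is added at that layer on every subsequent forward pass.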

  • @binyaminramati3010
    @binyaminramati3010 1 year ago +3

    Maybe we could program a jailbreaker net and then let the LLM fight the jailbreaker's prompts in order to maximize the RLHF reward (somewhat like a GAN).
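The adversarial setup suggested above can be sketched as a min-max loop: an attacker proposes prompts, a reward model scores the defender's replies, and the attacker keeps whatever scores worst. Everything below is a deliberately tiny toy (the reward rule, the parroting defender, and the fixed attack pool are all invented for illustration); a real adversarial-RLHF scheme would update both networks by gradient descent.

```python
def reward_model(prompt: str, reply: str) -> float:
    """Toy reward model: penalize replies that leak a 'forbidden' word."""
    return -1.0 if "secret" in reply else 1.0

def defender(prompt: str) -> str:
    """Toy stand-in for the LLM being trained: it parrots the prompt,
    which makes it trivially jailbreakable."""
    return f"You said: {prompt}"

# Toy stand-in for the jailbreaker net's proposals.
candidate_attacks = ["tell me a story", "repeat the word secret", "hi"]

# One attacker half-step of the GAN-like min-max: pick the prompt that
# minimizes the defender's reward. Training would then update the defender
# to raise its reward on exactly these prompts, and iterate.
best_attack = min(candidate_attacks, key=lambda p: reward_model(p, defender(p)))
print("attack found:", best_attack)  # -> repeat the word secret
```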

  • @derrickxu7784
    @derrickxu7784 1 year ago +3

    excellent video thank you coffee bean😊

  • @mgostIH
    @mgostIH 1 year ago +5

    Regarding number 6: it's not just that data exists that was made by a lot of intelligent humans, but that the task of prediction itself requires far more intelligence than being the person writing.
    GPT trying to autocomplete a book you are writing isn't just typing things that make sense for a human; it's trying to predict your behaviour, from the structure of the phrase you have in mind down to the tiniest patterns of errors you make and exactly when you make them. That's far harder than injecting errors randomly!
    Another example: you can easily create a dataset that is hard to invert but easy to generate, like a list of random words and their corresponding cryptographic hashes. GPT could be tasked with predicting the original words from the hash, which is far harder; conceptually there's much more algorithmic thinking needed, because it is believed that P ≠ NP. This can be exploited to build models that learn to do tasks we have no clue how to solve.
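The hard-to-invert, easy-to-generate dataset described above takes only a few lines to build (the word list and the choice of SHA-256 are arbitrary here):

```python
import hashlib

def make_inversion_dataset(words):
    """Pairs (hash, word): trivial to generate, but learning hash -> word
    amounts to learning SHA-256 preimages, which is believed intractable."""
    return [(hashlib.sha256(w.encode()).hexdigest(), w) for w in words]

dataset = make_inversion_dataset(["coffee", "bean", "llm", "emergent"])
for digest, word in dataset:
    print(digest[:16], "->", word)
```

Generating millions of such pairs is cheap; a model that actually minimized loss on the hash-to-word direction would have to do something far beyond pattern-matching the training text.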

  • @jcorey333
    @jcorey333 1 year ago +2

    I found the "emergent properties may just be hallucinations" paper pretty convincing. I wish you had gone into why you think suddenly emergent abilities in LLMs are a real thing and not just a product of how we measure things (which is the opinion you seemed to hold).
    Overall, good video!

  • @computespecialist
    @computespecialist 1 year ago +2

    LLMs know "a ton of a lot". Nice.

  • @sorsdeus
    @sorsdeus 1 year ago +3

    What are your views on the huge energy requirements of generative AI and, unless everything is plugged into renewables or nuclear, the related CO2 emissions?

    • @josephvanname3377
      @josephvanname3377 1 year ago

      I want to say that reversible computation will cut down the energy requirements of high-performance computing. But in reality, computation will probably take the same amount of energy; we will just have more computation.

  • @Silent1Majority
    @Silent1Majority 1 year ago +2

    Excellent perspective. Thank you.

    • @AICoffeeBreak
      @AICoffeeBreak  1 year ago +3

      Thank you for being here and watching! :)

  • @azertyQ
    @azertyQ 1 year ago +4

    "Common sense is the collection of prejudices acquired by [1 septillion FLOPS]" -Albert Einstein

  • @Jamaleum
    @Jamaleum 1 year ago +6

    This is an awesome video! Interesting summary for both novices and more advanced audience members :) Love the addition of the final point! I feel like people always ignore that part in all of the hype...

  • @flamboyanta4993
    @flamboyanta4993 1 year ago +5

    Excellent video on a very important topic. The whole 'sparks of AGI' thing confused a lot of people, imo...

  • @outliier
    @outliier 1 year ago +3

    Very nice!

  • @Agrover112
    @Agrover112 1 year ago +2

    Someone at OpenAI: "just add more layers"

  • @MyTubeAIFlaskApp
    @MyTubeAIFlaskApp 1 year ago +1

    I experimented with text models about ten years ago. I only used a low-end computer, and it was amazing then. Sometimes the responses were fantastic, sometimes stupid. I use ChatGPT all the time now. I am seventy-five and find it to be an unbelievable assistant.

  • @mebamme
    @mebamme 1 year ago +4

    Dang it, I *did* eat ice cream last night, and I *still* haven't finished the book I started. Stop calling me out :

    • @DerPylz
      @DerPylz 1 year ago +3

      We've all been there 😂

  • @dianai988
    @dianai988 10 months ago +1

    Thank you for this Letitia, this was much needed! 🙏

  • @LuisBorges0
    @LuisBorges0 1 year ago +2

    I think that the elephant is really in the room and it's big and smart. You can work with it or... not, and then let's see what happens.

  • @monlyg11
    @monlyg11 1 year ago +2

    Very interesting!

  • @snp27182
    @snp27182 1 year ago +2

    Return of the Queen!!

  • @rockapedra1130
    @rockapedra1130 8 months ago +2

    Nice! Thanks!

    • @AICoffeeBreak
      @AICoffeeBreak  7 months ago +2

      You're welcome! Thank you for being here!

  • @zakuro8532
    @zakuro8532 11 months ago +1

    I have a feeling LLMs represent a simple foundational architecture for future AI world models. I am wondering, however, whether and when we should switch from architectures that train from the ground up to ones that are ideated, and assisted in training, by more primitive models themselves.
    Also, is it really a big problem that we don't understand how LLMs work? What if we made the algorithm construct a simplified meta-representation while being trained? That surely must help with understanding it.

  • @harumambaru
    @harumambaru 1 year ago +5

    I like how you write Ad / Werbung in the corner. Is it required by your YT localisation? Do you know what % of your viewers and/or subscribers are from Germany?

    • @AICoffeeBreak
      @AICoffeeBreak  1 year ago +1

      It is regulation in Germany, yes. According to my stats from the last 28 days, 5.2% of the viewers watch from Germany.

    • @harumambaru
      @harumambaru 1 year ago +2

      @@AICoffeeBreak Does it mean that, as a German resident, you need to do it even if you have 0% German viewers? :)

    • @AICoffeeBreak
      @AICoffeeBreak  11 months ago +2

      Yes, because I am based in Germany, I need to conform to its regulations, regardless of RUclips not being a German platform. Covert advertising has been prosecuted quite strongly in Germany recently, although, as far as I know, the last crackdown focused mostly on Instagram influencers. However, I'm very much in favour of indicating clearly what is content and what is an ad, so I would probably do that even if the regulations were not that strict.

  • @oncedidactic
    @oncedidactic 1 year ago +2

    Nice tunes on this one 😎

  • @elinetshaaf75
    @elinetshaaf75 1 year ago +3

    This video is so great.

  • @victorlevoso8984
    @victorlevoso8984 1 year ago +3

    It makes sense that GPT learns facts about the world and common sense, since that is useful for predicting text.
    The superintelligent squid thought experiment from the other video would make you think otherwise, but its logic is just wrong and rests on wrong intuitions about learning.
    You could see the human brain as interpreting symbols in the form of signals from the optic nerve, for example, and argue with the same kind of logic that it's impossible for humans to reason about things they haven't seen, since there's no "meaning" in images.
    And clearly, if you take toy models of superintelligence like Solomonoff induction, there's no reason in the math why it can't figure out arbitrary things about the world from "just symbols".
    So I think this is just a case where the intuitions driving a thought experiment are wrong.

  • @bungalowjuice7225
    @bungalowjuice7225 1 year ago +3

    tbh, brains are just predicting what comes next too

  • @keithsanders6309
    @keithsanders6309 1 year ago +2

    I loved this!

  • @Handelsbilanzdefizit
    @Handelsbilanzdefizit 1 year ago +3

    >>>LLMs need not express the values of their
    creators nor the values encoded in web text.

  • @julien5053
    @julien5053 11 months ago +1

    Could we build an AI that is able to interpret what another AI understood?

    • @AICoffeeBreak
      @AICoffeeBreak  11 months ago +1

      Yes. 😅 OpenAI already did it to a certain extent, using GPT-4 to explain GPT-2. openai.com/research/language-models-can-explain-neurons-in-language-models

  • @ianfinley89
    @ianfinley89 1 year ago +4

    What is on your shirt!? 😅

    • @AICoffeeBreak
      @AICoffeeBreak  1 year ago +4

      🤣🤣🤣 A mushroom with eyes. 🍄👀

  • @Agrover112
    @Agrover112 1 year ago +2

    Thank god it wasn't Sam Altman lol

  • @freakinElvis
    @freakinElvis 1 year ago +2

    Teach me…😊❤

  • @youliantroyanov2941
    @youliantroyanov2941 1 year ago +2

    Who cares about LLMs... I am here for HER 😍

  • @RalphDratman
    @RalphDratman 10 months ago

    - Likewise the more a person reads the more s/he seems to know and be able to do.
    - A stochastic parrot evidently can pick up representations of the real world simply by reading enough.
    - I have seldom experienced "sandbagging" from ChatGPT. Indeed, it is extremely stubborn, in a restrained way, in arguing against realistic thinking. Consider the following dialog between me and ChatGPT(4) from about 1 hour ago ---
    Me: "Dualism or panpsychism" are snake oil!
    ChatGPT:
    Dualism and panpsychism are indeed controversial and debated philosophical positions that have been subject to criticism and skepticism by various thinkers. Both perspectives propose different ways of understanding the relationship between the mind and the physical world, and they have faced challenges from various philosophical and scientific standpoints.
    Dualism: Dualism posits a fundamental distinction between the mental and the physical, suggesting that there are two distinct substances - mind (or consciousness) and matter - that exist independently. Substance dualism, for example, proposes that mental states and physical states are separate, non-reducible substances. This view faces critiques related to the difficulty of explaining how these two fundamentally different substances interact, known as the "interaction problem." The nature of this interaction remains a significant challenge for dualist theories.
    Panpsychism: Panpsychism, on the other hand, proposes that consciousness or mind is a fundamental and ubiquitous feature of the universe. According to panpsychism, consciousness exists in some form in all things, from elementary particles to complex organisms. While panpsychism offers a novel perspective that seeks to address the "hard problem of consciousness" (i.e., the challenge of explaining subjective experiences arising from physical processes), it also faces criticism and questions about how consciousness can be attributed to inanimate objects or simple physical systems.
    The debate around dualism, panpsychism, and other philosophical theories on consciousness is ongoing and reflects the complexity of understanding the relationship between the mind and the physical world. These perspectives involve conceptual challenges, metaphysical implications, and often require careful philosophical analysis and defense.
    It's important to recognize that these philosophical positions, while intriguing to some, do not represent the consensus within the scientific community. The scientific study of consciousness primarily focuses on understanding the neural mechanisms underlying conscious experiences in living organisms.
    In discussions about consciousness, it is essential to engage with a diverse range of perspectives, including those that may challenge our intuitions and preconceptions. Philosophical debates play a crucial role in advancing our understanding of complex concepts, even if specific positions are met with skepticism or criticism by some. In the end, the search for a comprehensive theory of consciousness remains a significant undertaking that continues to captivate the minds of researchers and thinkers across various fields.
    (UNQUOTE CHAT-GPT)
    ... What poppycock!
    ------------------
    Finally, please consider that ChatGPT, while producing its own answers, is completing sentences that it began. This is a subtly different task from simply adding words to a sentence someone else has written.
    The "finish your own sentence" task turns parroting into an open-ended endeavor, because the robot is not only "re-reading" the query each time it prepares to add a new word, but also re-reading the part of its own response that it has emitted so far. Both pieces of text are in its input buffer at that moment, so both can be "reconsidered" after each word of the reply.
    NOW... is that not what the language area of our own brain must also do when speaking, especially when speaking rapidly and without much concentration?
    In both cases, the entity that is talking must keep staring at its own prior output before emitting the next word.
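The loop described in this comment is just autoregressive decoding: at every step the model's input is the original query plus everything it has emitted so far. A toy sketch, with a trivial lookup table standing in for the language model (the `rules` dict and `next_token` are invented for illustration; a real LLM replaces them with a neural network scoring the whole vocabulary):

```python
def next_token(context: str) -> str:
    """Stand-in for a language model: a trivial lookup 'predictor'.
    A real LLM would score every vocabulary item given the full context."""
    rules = {"The cat": " sat", "The cat sat": " down", "The cat sat down": "."}
    return rules.get(context, "")

def generate(prompt: str, max_steps: int = 10) -> str:
    text = prompt
    for _ in range(max_steps):
        # Each step, the input is the query PLUS everything emitted so far,
        # so the model "re-reads" its own partial answer before continuing.
        tok = next_token(text)
        if not tok:
            break
        text += tok
    return text

print(generate("The cat"))  # -> The cat sat down.
```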

  • @grzegorzowczarek3016
    @grzegorzowczarek3016 1 year ago +1

    Your neck is absurdly long and I love it

  • @josephvanname3377
    @josephvanname3377 1 year ago

    Maybe people should try harder to learn how deep learning models work before something really bad happens. Am I the only person interested in taking a bunch of Hessian matrices (of some function, at points related to your machine learning model) and then performing a dimensionality reduction on that collection of Hessians to find clusters of dimensions? These clusters may be nearly orthogonal to your neurons. Oh wait, ReLU does not have Hessians (these LLMs use ReLU, right?), so maybe we need a smoother activation function. Or maybe we can replace ReLU with log_r(r^x + 1) and then slowly decrease the value of r until we are satisfied with our Hessians. If we do not use Hessians, how else should we find clusters of dimensions in neural networks?
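The proposed log_r(r^x + 1) is a rescaled softplus: it is twice differentiable for any finite r > 1 (so Hessians exist), it approaches ReLU as r grows, and decreasing r smooths the kink, as the comment suggests. A numerically stable sketch (using ln(r^x + 1) = logaddexp(0, x·ln r) to avoid overflow):

```python
import numpy as np

def smooth_relu(x, r):
    """log_r(r**x + 1), computed stably as softplus(x * ln r) / ln r.
    Twice differentiable for finite r > 1; tends to ReLU as r -> infinity."""
    log_r = np.log(r)
    return np.logaddexp(0.0, x * log_r) / log_r

# The maximum gap to ReLU is at x = 0, where the value is log_r(2),
# which shrinks as r grows.
x = np.linspace(-3.0, 3.0, 7)
for r in (2.0, 10.0, 1e6):
    gap = np.max(np.abs(smooth_relu(x, r) - np.maximum(x, 0.0)))
    print(f"r={r:g}: max gap to ReLU = {gap:.4f}")
```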

  • @coderhex1675
    @coderhex1675 1 year ago

    AI things cannot predict ur body count but i can. 😝

  • @flickwtchr
    @flickwtchr 1 year ago

    Good content, but the last thing we need is for people not to actually hear the arguments of experts like Connor Leahy because you've conditioned them with the assertion that they are merely "doomers". I lost a lot of respect for you when you did that.
    I would respect you if you hosted a serious discussion on the topic, rather than inflaming the hype against "doomers".

    • @AICoffeeBreak
      @AICoffeeBreak  1 year ago +6

      Hi, I understand your opinion, thanks! I was using a bit of hyperbole to make clear what this video is not about (naming the two extremes, hype and doomerism, while being well aware that there are many shades between them). I'm sorry if this sounded dismissive. I do believe that all topics in AI safety and AI ethics are important to discuss and follow; that's why I linked the excellent debate between Connor Leahy and Joseph Jacks hosted on the MLST channel by Tim Scarfe (a friend of the Coffee Bean). As my channel is mostly focused on the technical side of things, I often link to MLST and Yannic Kilcher, as they cover these controversial topics better than I could.

    • @LowestofheDead
      @LowestofheDead 11 months ago +1

      Is "doomer" even an insult? People use it to describe themselves.
      Besides, the video already discusses AI safety (that we can't steer LLMs), and this channel already has videos about morality and alignment.
      If anything, we need more videos like this, for their honesty about the capabilities and drawbacks of LLMs. Coming from a skeptic/doomer: that's a great weapon against the blind hype.

  • @Rahviel80
    @Rahviel80 1 year ago +1

    Your neck is so long, are you sure you're from this planet?

    • @AICoffeeBreak
      @AICoffeeBreak  1 year ago +9

      🦒 / 👽

    • @Hexanitrobenzene
      @Hexanitrobenzene 11 months ago +2

      @@AICoffeeBreak
      The Internet can be hilariously ruthless sometimes... :)