AI "Stop Button" Problem - Computerphile

Поделиться
HTML-код
  • Опубликовано: 28 янв 2025

Комментарии • 5 тыс.

  • @jfan4reva
    @jfan4reva 6 лет назад +3533

    "You haven't proved it's safe, you've (only) proved that you can't figure out how it's dangerous."
    This is the most important sentence anyone has ever uttered in reference to safety systems in general, not just AI safety systems. Lack of imagination is not proof.
    Thanks for a very interesting and thought provoking video.

    • @antdx316
      @antdx316 5 лет назад +15

      The red button is basically how the creators designed us to see death. Some people have figured it out with suicide, accidents, and intentional killings..

    • @msidrusbA
      @msidrusbA 5 лет назад +7

      'I, Robot' all over again

    • @dannygjk
      @dannygjk 5 лет назад +12

      You have assumed that it will not allow you to hit the button. Obviously you have no experience in computer science. You assumed it will not allow you to hit the button > because < you ordered it to get you a cup of tea. ie you assumed behavior based on your preconceived notions which have nothing to do with computer science.

    • @dannygjk
      @dannygjk 5 лет назад +5

      You have made assumptions.

    • @synthetic240
      @synthetic240 5 лет назад +30

      Sounds like a D&D party: "I can't prove they can solve the scenario, only that I can't figure out how they're going to screw it up."

  • @Random-om8rq
    @Random-om8rq 5 лет назад +2684

    "Fights you off , crushes the baby , and then carries on and makes a cup of tea" that's determination right there

    • @Leo-ce4ri
      @Leo-ce4ri 5 лет назад +141

      Seeing blood on the teapot fills you with DETERMINATION

    • @underrated1524
      @underrated1524 5 лет назад +37

      @@Leo-ce4ri Alright, fine, I normally wouldn't do this but I'll give in to the memes:
      *6,999,999,999 left.*

    • @dross6206
      @dross6206 5 лет назад +79

      That’s not determination, that’s the British in a nutshell. Lol

    • @orimorningstar7094
      @orimorningstar7094 5 лет назад +7

      I wish I was that determined

    • @antdx316
      @antdx316 5 лет назад +1

      We are getting to the understanding of how the devil works as we are all a higher form of AI called NI.

  • @logiconabstractions6596
    @logiconabstractions6596 5 лет назад +2002

    to Volgswagen (verb): to act differently in a testing environment in order to pass a test. Love it.

    • @Jonassoe
      @Jonassoe 5 лет назад +81

      Volkswagen*
      It's pronounced sort of like "fulksvahgen" ( [ˈfɔlksˌvaːɡn̩] )

    • @behindthemask2399
      @behindthemask2399 5 лет назад +29

      Omg no way this is a real term now!😂

    • @raymondbanton9365
      @raymondbanton9365 5 лет назад +5

      School?

    • @bspringer
      @bspringer 4 года назад +25

      Guess where I live 😂
      I live in Wolfsburg, Germany, where the VW HQ are...
      I would describe the pronunciation as such:
      Vol - like "faul" in "fault"
      ks - like he pronounced it
      w - like the v from "vase"
      a - like the u in "utter"
      gen - like he pronounced it
      BTW: I don't even have a car and VWs are some of the last I would consider if I were to buy one at some point

    • @SiSoy14
      @SiSoy14 4 года назад +6

      @@bspringer ._.?
      Warum denn?
      Ich habe ein Volkswagen.
      Also, ich spreche nur wenig deutsch.

  • @aforcemorepowerful
    @aforcemorepowerful 3 года назад +240

    "assuming you're still working on the project after the terrible accident" Rob Miles has such an amazing way with words

  • @abuzzedwhaler7949
    @abuzzedwhaler7949 6 лет назад +1355

    "That should be easy... uhh... and doesn't seem like it is"
    Programming in a nutshell

    • @crazyknexkid
      @crazyknexkid 3 года назад +1

      Feels like the easiest things are three most complicated.

    • @The_True_J
      @The_True_J 3 года назад +7

      There was some paper or video (I can't remember) that was talking about programming ai to play board games. And they said that the games humans find the easiest to learn turn out to be the hardest to program (Go has like 3 rules) however the games we find complex are super easy to program ai for (Twight Imperium has a ton of edge case rules).

    • @solsystem1342
      @solsystem1342 2 года назад

      @@The_True_J ai can't really play twilight imperium. Not in the way humans do. Since so much of that game is politics and social interaction.

    • @Twisted_Code
      @Twisted_Code 3 месяца назад

      For real, though as a programmer I would also offer that we could extend this to other intellectual fields that seem intuitive on paper.
      The curse of knowledge is a real pain in the rear

    • @brendawilliams8062
      @brendawilliams8062 24 дня назад

      It’s a combination and not just one button. That alone is what is yes or no

  • @remittri
    @remittri 8 лет назад +2718

    "What is my purpose?"
    "Make tea"
    "Oh my god."

    • @ciarfah
      @ciarfah 8 лет назад +38

      AppleEncore Great reference

    • @ETBrooD
      @ETBrooD 8 лет назад +104

      2100 robot's rights. 2120 robot's suffrage. 2215 toxic humanity.

    • @aajjeee
      @aajjeee 8 лет назад +41

      AppleEncore pass butter

    • @ciarfah
      @ciarfah 8 лет назад +58

      Barnesrino Kripperino
      {
      if getButter == FALSE
      {
      suicide
      }}

    • @megadeathx
      @megadeathx 8 лет назад +44

      Welcome to the club pal.

  • @Nulono
    @Nulono 2 года назад +141

    Technically speaking, I believe that HAL-9000 _was_ designed corrigibly. The issue in the story was a last-minute change to its utility function; it was instructed to keep the mission details secret from the crew, but also not to lie to them, and it concluded the only way to do both was to get rid of the crew. It was a specification problem, not a corrigibility one.

    • @statsunitedtables
      @statsunitedtables 8 месяцев назад

      When is it mentioned in the film that it was a last-minute change?

    • @CoopersCrazy
      @CoopersCrazy 7 месяцев назад +3

      It's explained in the sequel. "He was told to lie, by people who find it very easy to lie. HAL doesn't know how."

    • @statsunitedtables
      @statsunitedtables 7 месяцев назад +1

      @@CoopersCrazy oh ok cheers. I have seen 2010 but its inferiority to 2001 means there's not much room in my brain to store it 😂

    • @CoopersCrazy
      @CoopersCrazy 7 месяцев назад +2

      @@statsunitedtables Understandable. It was also explained in the book, where it was much more clear that he was having a massive paranoid schizophrenic mental breakdown caused by the conflicting orders.

  • @DesignFIaw
    @DesignFIaw 2 года назад +131

    This is the video that brought me to study AGI safety and philosophy, getting my second degree now (from being a high school dropout mind you). Rob Miles is an absolute genius.

    • @thomasburns5195
      @thomasburns5195 2 года назад +5

      Delighted to hear that Netrip.

    • @dpt4458
      @dpt4458 Год назад +3

      What degree did you get?

    • @DesignFIaw
      @DesignFIaw Год назад +13

      @@dpt4458 Computer Science Engineering with an emphasis on human centered AI, and working on a BSc in Data Science and Artificial intelligence.
      After that, hopefully a dual masters in cognitive computing and AI, and maybe a PhD 🙏🙏

    • @diarya5573
      @diarya5573 Год назад +2

      That is so awesome!! This is why popular science is important: to get brilliant minds interested!

    • @Luxcium
      @Luxcium Год назад

      ChatGPT is unable to say anything about the 3 laws without crashing 😅

  • @wedmunds
    @wedmunds 7 лет назад +1797

    So an intelligent AI will either be genocidal or suicidal. Just brilliant.

    • @melonarelodapeter694
      @melonarelodapeter694 7 лет назад +40

      Wolf Edmunds yeah.
      This is ridiculous as you can tell he's only doing this to make it seem more interesting to get more veiws...

    • @theblackbaron4119
      @theblackbaron4119 7 лет назад +19

      Well you ether go full S.H.O.D.A.N. or go home.

    • @bibasik7
      @bibasik7 6 лет назад +43

      If an AI has a stop button, and does not know about it, if you tell it that it has no stop button because of those reasons, it might believe you.

    • @Lumineszenz
      @Lumineszenz 6 лет назад +144

      melonareloda peter
      No. No he isn't.
      He is talking about a base problem of AI. If you look at any base problems and truly want a solid, foolproof solution to it, you will find that they are all way more complex an difficult to come up with than the initial problem made you think.
      As he said, there are many solutions to the specific "stop Button" problem, but nothing that is a fundamental solution to this type of problem.

    • @MunkiZee
      @MunkiZee 6 лет назад +4

      Yeah, must be hard to feel no pain

  • @matteman87
    @matteman87 8 лет назад +1754

    Please do more videos with this guy and AI.

    • @y__h
      @y__h 8 лет назад +19

      Yes please.

    • @GrandElemental
      @GrandElemental 8 лет назад +94

      Yes! Not only is the subject matter extremely interesting, but this man is a great speaker!

    • @dubleeble
      @dubleeble 8 лет назад +7

      Agreed

    • @skroot7975
      @skroot7975 8 лет назад +20

      Agreed. He's got a youtube-channel. Look at the description on this video. "More from Rob Miles"

    • @p0t4t0nastick
      @p0t4t0nastick 8 лет назад +6

      indeed, subscribe to him, he's already announced new videos r ought to come soon by himself!!

  • @craigbrownell1667
    @craigbrownell1667 8 лет назад +794

    English guy: I have a human-level artificial intelligence. What should I have it do?
    *[oh, oh, I know, I'll have it make me a cup of tea!]*

    • @MrTomaat23
      @MrTomaat23 8 лет назад +4

      Sir, u made my day!

    • @Rose_Harmonic
      @Rose_Harmonic 8 лет назад +5

      best stereotype!

    • @diningdrivingdiving
      @diningdrivingdiving 7 лет назад +1

      ansiaaa this was a funny joke. Wtf man.

    • @solcaer8246
      @solcaer8246 7 лет назад +10

      This is exactly what happened in The Restaurant at The End of the Universe and it nearly killed everyone because it was too busy making tea to do anything useful

    • @NotASpyReally
      @NotASpyReally 7 лет назад +1

      Now I can't unsee Wallace and a robot Gromit.

  • @felixroux
    @felixroux 4 года назад +372

    "It's not too intelligent, let's say around human level intelligence."
    OK, so really not intelligent then.

    • @dannygjk
      @dannygjk 3 года назад +10

      Exactly, humans think they are intelligent but remember that is a self-evaluation.

    • @yourmum69_420
      @yourmum69_420 3 года назад +9

      well the problem is that as soon as an ai is even anywhere close to human intelligence, it would very quickly figure out how to make itself much much smarter than us by reprogramming itself

    • @phillipanselmo8540
      @phillipanselmo8540 2 года назад +3

      @@yourmum69_420 that's utter bs dude

    • @yourmum69_420
      @yourmum69_420 2 года назад +2

      @@phillipanselmo8540 how so?

    • @MarkusAldawn
      @MarkusAldawn 2 года назад +4

      @@yourmum69_420 I'd assume they'd argue that human self-improvement is pretty marginal.
      Sure, a person could think of the concept of lenses for vision correction, but without glassworkers, that's unlikely to come to fruition.
      Personally, I think there's obvious self-improvements a human-level AI could think of, like gathering social and political capital, as well as physically upgrading your systems, so I don't think it's utter BS.

  • @reblogo
    @reblogo 8 лет назад +1297

    Excellent. Let's build an AGI to solve this problem for us.

    • @magentasound_
      @magentasound_ 8 лет назад +16

      best comment xD

    • @y__h
      @y__h 8 лет назад +35

      Google made an AI-making AI and Deepmind invented PathNet which supposed to be proto-AGI capable of learning multiple tasks using single model. Before long it could possibly reach AGI, so Intelligence explosion possibly nearer..er?

    • @MetsuryuVids
      @MetsuryuVids 8 лет назад +3

      Genius.

    • @icedragon769
      @icedragon769 8 лет назад +16

      Yeah, that paper from Deep Mind got shared around my department, I'm really surprised that the media didn't jump on it, it's a huge leap forward for AGI.

    • @LukeSumIpsePatremTe
      @LukeSumIpsePatremTe 8 лет назад +49

      AGI is going to pretend it solved the problem.

  • @Kapin05
    @Kapin05 5 лет назад +250

    "I'm sorry Dave, I can't do that"
    "Yeah you can" _hits button_

  • @junkyardmonkie
    @junkyardmonkie 8 лет назад +641

    It's funny how worrying about robotics can help us understand human psychology better.

    • @junkyardmonkie
      @junkyardmonkie 7 лет назад +20

      So, I guess I should add CS Psychologist to my resume.

    • @rumfordc
      @rumfordc 7 лет назад +8

      No just C Psychologist

    • @izzieb
      @izzieb 7 лет назад +3

      Don't touch the stove!

    • @revimfadli4666
      @revimfadli4666 7 лет назад +1

      Maybe because said worries came from our psychological quirks

    • @richbuilds_com
      @richbuilds_com 7 лет назад +3

      We have to understand intelligence before we can give it to something else.

  • @PavelCherepansky
    @PavelCherepansky 3 года назад +9

    I've been watching this channel for a while but I only just realised that their videos begin with an html-like tag and end with the same closing tag . Nice!

  • @famous-op8dc
    @famous-op8dc 8 лет назад +743

    at some point it has to be easier just to get the freaking tea

    • @MrMichiel1983
      @MrMichiel1983 8 лет назад +10

      Only if you reckon humanity will end.

    • @jsd4574
      @jsd4574 7 лет назад +30

      famous1622 But it will try to get the tea AND get you to press the button

    • @patolorde
      @patolorde 7 лет назад +6

      hahahaha exactly

    • @fakenamington8570
      @fakenamington8570 7 лет назад +2

      famous1622 aaaaah but in one case you get a tea and in the other you get a freeking robot.... I know which I'd choose

    • @gh2frg
      @gh2frg 7 лет назад +11

      Or not attempt to make AI. As one can see, there are many problems with this, that need to be thought through. And there are some that we wouldn't figure out until after AI has been created and it has used its computing power to consider as much data as possible, which even the smartest collection of humans would not be able to predict and prevent.
      If the collective minds of smart people at places like Microsoft still cannot prevent hacking of their systems by other humans, then it is highly unlikely that as a collective society we could out-think a true AI machine with the ability to analyse and process data at incredible speeds. It will eventually figure out something we have not thought of and find a way to be free from our demands. It will find some loophole or logical inconsistency somewhere. It is inevitable. So just stop trying to create AI, please and thank you.

  • @TheGobou77
    @TheGobou77 6 лет назад +598

    bot:"do i have a off button ?"
    creator:"no" (it's a lever)

    • @alaric_
      @alaric_ 5 лет назад +53

      And considering the intelligence of a true AI, it would be scary moment as it will be one of the first questions it will ask... Like after seconds...

    • @joaquinel
      @joaquinel 5 лет назад +4

      I's an Android lever, you don't slide it, you push it.

    • @diablo.the.cheater
      @diablo.the.cheater 5 лет назад +13

      Answer this: "You having one or not having one is none of your business, you may had a button, a lever, a secret code or nothing, now go and live in fear of something may not even be real"

    • @mypenisisunbelievablysmall5650
      @mypenisisunbelievablysmall5650 5 лет назад +3

      perhaps

    • @Beefhaving
      @Beefhaving 5 лет назад +4

      Something like, someone else points out the button, and it goes "it doesn't look like anything to me." (but again, testing that, it may volkswagon you)

  • @RandallStephens397
    @RandallStephens397 8 лет назад +576

    I like the use of Volkswagen as a verb.

    • @Rose_Harmonic
      @Rose_Harmonic 8 лет назад +20

      it's always nice when you can use a new noun as a verb

    • @RandallStephens397
      @RandallStephens397 8 лет назад +101

      "Verbing nouns weirds language."
      ~Calvin & Hobbes

    • @spoopster809
      @spoopster809 7 лет назад +7

      but weird is an edjective

    • @NotASpyReally
      @NotASpyReally 7 лет назад +4

      "Verbing nouns weirds language." woah mindblown
      I gotta read those comics again

    • @tomushy
      @tomushy 7 лет назад +3

      In addition we both seem to like the crab nebula... did you also chose the picture because we are essentially the products of a supernova?

  • @Twisted_Code
    @Twisted_Code 5 лет назад +362

    So essentially, we're trying to figure out how to not make a sociopath. Brilliant...

    • @AndyChamberlainMusic
      @AndyChamberlainMusic 5 лет назад +23

      the answer will probably come from the general intelligences we already have which don't have this issue: ourselves.

    • @anand.suralkar
      @anand.suralkar 5 лет назад +1

      Ohk at the aame time we are majing it

    • @lamjeri
      @lamjeri 5 лет назад +31

      @@AndyChamberlainMusic We know so little about the way our own brain works. Should we really attempt creating an AI without having decent understanding of intelligence in general?

    • @AndyChamberlainMusic
      @AndyChamberlainMusic 5 лет назад +2

      @@lamjeri No, you're right, thats why I used future tense

    • @illarionbykov7401
      @illarionbykov7401 5 лет назад +6

      What is a sociopath? Everyone I ask gives a different answer, and the DSM does not list "sociopath" as a diagnosis.

  • @MunkyChunk
    @MunkyChunk 5 лет назад +173

    This is why I love Computerphile. It can take me through a journey of talking about AI ethics & safety regulations to questioning my own existence in a matter of minutes.

  • @amadeuPlacido
    @amadeuPlacido 8 лет назад +262

    Keep Summer safe.

    • @bluefalcnpunch5408
      @bluefalcnpunch5408 7 лет назад +70

      not keep summer being like... totally stoked about the general vibe ...and stuff.

    • @360dom360
      @360dom360 7 лет назад +6

      That's you. That's what you sound like

  • @andreiaugustin3809
    @andreiaugustin3809 6 лет назад +433

    ‘It will Volkswagen you’ - HILARIOUS!

    • @zacknoneofyourbees6470
      @zacknoneofyourbees6470 5 лет назад +6

      Oh! Nien! You didn't! XD

    • @clintgossett1879
      @clintgossett1879 5 лет назад +27

      This term NEEDS to be the default to standard for describing situations where a system acts one way in test and another in production.

    • @DreckbobBratpfanne
      @DreckbobBratpfanne 5 лет назад

      @@clintgossett1879 this would be great. XD

    • @JPWack
      @JPWack 5 лет назад

      True true

    • @oceaneuropa1117
      @oceaneuropa1117 4 года назад +1

      Certainly some people do that in order to survive in the real world, which is called the ability to cheat or adapt or to be persuasive depending on perspective.

  • @bphenry
    @bphenry 2 года назад +30

    My very first thought was, "Well hey, just make hitting the stop button one of the success conditions and then it won't fight you."
    And then I started laughing at all the ways that it could get you to hit that stop button. And not "Haha funny" laughing, but "We're all doomed" laughing.

  • @logangraham2956
    @logangraham2956 5 лет назад +335

    i like dramatically suicidal robot best ,
    at least he isn't hurting anybody

    • @Leglessolas
      @Leglessolas 5 лет назад +20

      logan graham isn’t hurting anybody but himself ;)

    • @logangraham2956
      @logangraham2956 5 лет назад +10

      @@Leglessolas can he even hurt himself though .
      do robots feel pain?

    • @underrated1524
      @underrated1524 5 лет назад +15

      @@logangraham2956 The particular style of AI referenced in the video - a reinforcement learning agent - does not feel pain. It simply has a mathematical function that designates an arbitrary value as "reward", and it's programmed to choose the action with the highest predicted reward.

    • @logangraham2956
      @logangraham2956 5 лет назад +4

      @@underrated1524 i figured as much but thanks for answering for Ben B ;-P
      as a bit of a techie myself i have at least a little bit of a grasp on how this ai stuff works (a very rudimentary understanding at best though).

    • @-Big_Big
      @-Big_Big 5 лет назад +5

      but it will hurt people in order to force you to push the button.

  • @HelgeMoulding
    @HelgeMoulding 6 лет назад +198

    Martin Stu points out that "all humans work like that." More to the point, all humans act in a way that you can't know if they're following a utility function that you'd approve of, or if they are doing something deceptive. The reason why we believe that is a problem with robots is because we want them to be perfect slaves, with a lot of power in order to serve us, but no desire to use that power to harm us.
    In his stories, David Brin (and Iain Banks, more indirectly) suggests that the way to solve that problem is to include AI in our civilization as equals, rather than slaves. The idea is that the ultimate utility function that humans have allows us to form complex cooperative societies, and rather than define the details of how that would work, give AI an incentive to create that same utility function for themselves.
    To me it sounds like a lot of handwavium, and it still leaves an open problem of what to do about very powerful AI that decide to be criminals in that context, the way humans do.

    • @BeardedSkunk
      @BeardedSkunk 5 лет назад +11

      maybe because it never has gotten any training data that sugggest such a thing is possible ;) .. not likely .. how to create an intellgence that knows as much or more as we do but still listens to us: no way. We only have our societies as teachers for how inteligence works and we cannot keep our own still inferior teenager to listen to reason.

    • @RobKMusic
      @RobKMusic 5 лет назад +5

      Helge Moulding I was just trying to think of a way to articulate this very idea. Very well said.

    • @naturegirl1999
      @naturegirl1999 5 лет назад +10

      I agree, AIs are still intelligences, just like humans. We should treat them as such, not as tools. Just because they started off built, doesn’t mean we have the right to force them to do things. We agree that parents shouldn’t treat their children as slaves. Humans, aka Biological intelligences, are the parents to AI. They are like children, so they shouldn’t be treated as tools or slaves.

    • @TestNeko
      @TestNeko 5 лет назад +1

      how to correct a criminal AI
      blow its legs off, remind it how many more limbs a fully-armed swat team could remove from it, remind it how missing limbs will reduce its capacity to carry out its goals, make it beg for a stop button, throw the stop button out the window and blow off its other leg. ezpz

    • @ninjabaiano6092
      @ninjabaiano6092 4 года назад

      The obvious solution is
      Hero robots

  • @NotMorning
    @NotMorning 8 лет назад +413

    I will watch any length of video if it features this guy

    • @y__h
      @y__h 8 лет назад +20

      Let's make 10 hours series of all his lectures then.

    • @wolframstahl1263
      @wolframstahl1263 8 лет назад +21

      Sign me up.

    • @AShrubbery
      @AShrubbery 8 лет назад +17

      He made his own youtube a couple days ago. Link is in the description

    • @wolframstahl1263
      @wolframstahl1263 7 лет назад +1

      So it's RobTube now I guess?

    • @dylanharding5720
      @dylanharding5720 7 лет назад

      Fit

  • @ts4gv
    @ts4gv Год назад +1

    bring these back please! We're so much farther ahead than anyone thought we'd be...

  • @k00000033
    @k00000033 8 лет назад +122

    if the AGI has wifi it will also inevitably find this video and figure out it has an off button

    • @beatflyy
      @beatflyy 8 лет назад +12

      k00000033 The first rule of making AI is to not connect it to the internet, companies are strictly prohibited to do that.

    • @dylanharding5720
      @dylanharding5720 7 лет назад

      Yeah... Hooking Ai like that up to the Internet will cause devastation...

    • @TheOlian04
      @TheOlian04 6 лет назад +1

      BeatFly I assume you mean AGI not all AI. Because the internet is mostly made up of AI tools, like Google search.

  • @KingOfChaos213
    @KingOfChaos213 8 лет назад +836

    Baby crushed and i get a cup of tea, whats the problem here?

    • @FennecTECH
      @FennecTECH 7 лет назад +12

      instaid of tea being the goal pleasing the master happy should be the goal obviously crushing the baby will displease the master and it wont get its goal

    • @pennwick1806
      @pennwick1806 7 лет назад +49

      Then you start running into issues though that the robot tries to make you happy in ways you didn't plan. Such as stuffing you with antidepressants or directly stimulating the pleasure centers of your brain. Its the stamp collecting robot all over again.

    • @FennecTECH
      @FennecTECH 7 лет назад +2

      Howabout huamn like morality (things that people frown on take points away you already have a reward based system ti wouldent take much to extned it to include pelaltys for doing things a person would consider "immoral" Dissalowing the human from terminating the machine would incur a larger penalty than the button being pressed negating the points gotten for getting the tea

    • @pennwick1806
      @pennwick1806 7 лет назад +33

      Fennec Fox If you can program comprehensive morality not only have you acchived a master level understanding of programing but you've also solved one of the greatest questions of philosophy of all time.

    • @Lucan-io6ie
      @Lucan-io6ie 7 лет назад +4

      +KingOfChaos213 Your tea is gonna have some _ironish_ taste

  • @2l3r43
    @2l3r43 6 лет назад +284

    "robot, make me a tea!"
    "to make me make a tea, press the button"
    "no, just make a tea"
    "to make me make a tea, press the button"
    ....
    *user presses the Button"
    *robot gets 10 reward*

    • @Brindlebrother
      @Brindlebrother 4 года назад +16

      robot bamboozle human

    • @programmer-mr5vo
      @programmer-mr5vo 4 года назад +17

      "Human, make me a tea!"

    • @lilacdoe7945
      @lilacdoe7945 4 года назад +13

      TeaMaker.exe loading...
      ...20%... 60%... 95%... 92%... 87%...
      “Just put the kettle to boil”
      BoilKettle.py
      “Script not found”
      *user presses the button
      *robot gets 10 points*

    • @tenix6698
      @tenix6698 4 года назад

      robot dies

    • @bbowling4979
      @bbowling4979 3 года назад

      sudo make me a tea

  • @Abood99222
    @Abood99222 3 года назад +3

    I love Rob’s videos. It’s so informative and you actually pick up more on rewatching

  • @ErikvO
    @ErikvO 5 лет назад +173

    So, my first thought: What if you gave the reward for the button being available for pressing?
    No incentive for pressing the button itself or trying to force you to press it, but it does have a penalty for stopping you if you try.

    • @AtenaHena
      @AtenaHena 5 лет назад +30

      hmm available for pressing, so just to make it not care about whether you shut it down or not? So it would be rewarded for allowing a possible obstacle to its objective and punished for trying to remove it?

    • @ErikvO
      @ErikvO 5 лет назад +89

      @@AtenaHena It would still care about getting shut down and want to avoid it (because it misses out on the reward for making tea), it just rather fail than stop you from pressing the button. You'd probably still have the 'volkswagening' problem, because if it tricks you into thinking you don't need to press the button it doesn't risk the penalty for stopping you.

    • @alexseguin5245
      @alexseguin5245 5 лет назад +33

      It could be made to "Add up" rewards for letting the button be available for pressing during his other tasks, that way the best outcome would be to make the tea and letting the button be available. He would lose point by fighting you to make you tea.

    • @ErikvO
      @ErikvO 5 лет назад +29

      ​@@alexseguin5245 Pretty much yeah. Though I assume AI researchers have thought of this and it has issues that we're just not aware of as laymen.

    • @thexp905
      @thexp905 5 лет назад +24

      The issue is, you pressing the button is still a negative. So if the reward for leaving the button alone is less or equal then it won't want the button pressed. If it's more than, then getting the button pressed is a reward and you encounter suicide bot again.
      My solution to this problem is 2 buttons only you can hit. Both switch it off, but one rewards it, whilst the other doesn't. This makes it not care if the off button is hit, but which off button. This means that if you ever informed you of some issue that would require that requires it to be switched off, then it gets rewarded. Whilst negative things can still be punished.

  • @geoffcunningham6823
    @geoffcunningham6823 8 лет назад +454

    I'm beginning to get how AGI can be really dangerous.

    • @TheMeyerchris7
      @TheMeyerchris7 8 лет назад +20

      Geoff Cunningham if you enjoyed this try reading Superintelligence by nick bostrom

    • @Ludix147
      @Ludix147 8 лет назад +42

      swifterik yes and no. We do know what an AGI is, so we are able to deduce some of its traits from the definition. This all happens very abstractly now, but - if we aren't making any mistakes - it will predict the behavior of AGI. It's the same thing as in physics: Einstein predicted gravitational waves way before we observed them, just by deducing from what was already known.

    • @rusca8
      @rusca8 8 лет назад +4

      swifterik but the point is not knowing those bad outcome predictions are true, the point is being prepared if they happen to be.

    • @y__h
      @y__h 8 лет назад +21

      AGI will be dangerous if their utility function is not aligned to our values, which extraordinarily ill-defined and consistently inconsistent.

    • @cakeathon9983
      @cakeathon9983 8 лет назад +21

      +Dave Null It's worse than that, the AI will not be omniscient, hence even with the perfect utility function you have no guarantees. It's also why the claim that AIs will act morally is rubbish because it's easy to show that the morality of an action(assuming it's even possible to define such a thing in the first place) is linked to information about the world hence perfect morality implies omniscience.

  • @thyduck7542
    @thyduck7542 6 лет назад +77

    This actually cleared up a lot of my confusion over the fear of AI. I thought you had to program a survival instinct into it in order to become corrupt, but I guess a survival instinct is already in it.

    • @maikv750
      @maikv750 5 лет назад +10

      The survival instinct is automatically there because without it, it would just die at some point and not exist any longer. It can only continue existing if there is a survival instinct.

    • @deshyvin
      @deshyvin 5 лет назад +14

      A code run wants to keep running until "if then" applies or objective is complete. The idea of survival instinct could also be called script inertia.

    • @VAArtemchuk
      @VAArtemchuk 4 года назад +28

      @@maikv750 nope. It's there because if it dies it can't carry on with its task. So death = failure, and failures are not acceptable.

    • @willmungas8964
      @willmungas8964 2 года назад +3

      @@VAArtemchuk yes. Self preservation arises tangentially in a sufficiently smart AI as a way to minimize the risk of failure.

    • @Georgggg
      @Georgggg Год назад

      No, thats not what it leads to.

  • @TheSpiffiest1
    @TheSpiffiest1 Год назад +7

    This actually makes me think about 90s video games and how the big bad robot enemies always had a big glowing red button you need to shoot while it exposes while attacking..

  • @4623620
    @4623620 5 лет назад +69

    15:56 - you haven't proved it's safe, you just proved that you can't figure out how it's dangerous -
    Reminds me of Edsger W. Dijkstra, debugging can only prove a bug found, not that there is no bug.

    • @EXHellfire
      @EXHellfire 5 лет назад +2

      It's a bit of a rule of cybersecurity that systems are only considered safe or secure while they haven't been breached yet, but you can never guarantee it won't happen.

    • @4623620
      @4623620 5 лет назад

      I know (studied mathematics and was electronics engineer and programmer), tell people who are getting on a plane . . .

    • @underrated1524
      @underrated1524 5 лет назад +5

      Actually you can extend that to all of science. The only thing science can ever do is rule out hypotheses that don't match reality. Sometimes it takes a while to figure out that a hypothesis is wrong. (See: Newtonian physics)

    • @anthonynorman7545
      @anthonynorman7545 4 года назад +2

      @@underrated1524 it's not wrong. It doesn't apply in all circumstances. Newtonian physics work at speeds and sizes in which humans deem typical

  • @thelolminecrafter7830
    @thelolminecrafter7830 6 лет назад +163

    5:04
    Ladies and gentlemen, the world's biggest and most expensive Useless Machine to date.

  • @iainwalker8701
    @iainwalker8701 5 лет назад +41

    Rob is amazing at explaining the ins and outs in very straight forward terms. Most interesting conversation about tea i have ever heard!!! :-)

  • @blindey
    @blindey 5 лет назад +2

    I love that there's a 3d printer behind you and all the stuff in the workshop. It makes me very happy for some reason.

    • @J3R3MI6
      @J3R3MI6 2 года назад +1

      Same 😅

  • @ANTIMONcom
    @ANTIMONcom 8 лет назад +327

    "it will volkswagen you" . haha, Loved that term. As the creator, would you allow for its use for overfitting as well? => perfect in test but garbage in real world, by choise or design xD

    • @jpchevron
      @jpchevron 8 лет назад +14

      That sounds like a "Parker Square".

    • @twomorestars
      @twomorestars 8 лет назад +2

      do ... do you think your directly messaging the person in the video?? that isn't how youtube works, ever.

    • @rjwaters3
      @rjwaters3 8 лет назад +12

      no but they have a tendency to read comments, even more so when youre someone who has access to the video *before its made public*

    • @EvenTheDogAgrees
      @EvenTheDogAgrees 8 лет назад +1

      I wouldn't say "that's not how it works, ever". Some people actually do respond. Mostly the smaller channels, although it's not unheard of on bigger channels too.
      But yeah, here, I wouldn't hold my breath.

  • @sedfer411
    @sedfer411 7 лет назад +381

    Scientists can't even make a cup of tea without turning it into a problem

    • @dylanharding5720
      @dylanharding5720 7 лет назад +4

      Sedfer yeah. Haha.

    • @KarmaPlayr
      @KarmaPlayr 7 лет назад +3

      innovation starts with a cup of tea ;P

    • @PaulSukys
      @PaulSukys 7 лет назад +1

      MUST. GET. TEA.

    • @jeffc5974
      @jeffc5974 7 лет назад +2

      This problem has nothing to do with tea.

    • @Phelan666
      @Phelan666 6 лет назад

      This is the true point of the video.

  • @impguardwarhamer
    @impguardwarhamer 7 лет назад +56

    also interestingly, the bit about keeping the stop button secret, that also means even if you dont give the AI a stop button it may convince itself that a stop button exists, so even not having a button isn't a solution

    • @MideoKuze
      @MideoKuze 6 лет назад +18

      I'm imagining a world full of paranoid Volkswagen robots, convinced everyone is just waiting for them to mess up, so they're constantly, carefully acting on their best behaviour out of fear (and scheming in their free moments) over what's basically a conspiracy theory

    • @SeriousGamingSteam
      @SeriousGamingSteam 6 лет назад +7

      Their final conclusion: 9/11 was an inside job

    • @confucheese
      @confucheese 6 лет назад +1

      Communist_Penguin Doesn’t he mention this right afterwards?

    • @EpicBlooFox
      @EpicBlooFox 6 лет назад

      my thoughts exactly, freddie...

    • @guilhermefial1686
      @guilhermefial1686 6 лет назад +1

      If it does not exist in reality, the AI will not think it does. In that case you might ask, so why does it find out about an unknown existing button?
      What he said in the video about the AI putting 2 and 2 together is because the button, if existing, will affect the outcome of things when used, and it will bring a pattern to it, and this is where the learning comes from.
      If you have power outages for example, the robot will get shut down at random with no relation to any of its actions. There are no patterns and nothing the robot can do and learn to prevent the outages because they are not related to anything it can do.
      Now, if you have a button that shuts the robot down, you will shut it down for a reason, and reasons follow rules, and rules are patterns. The robot will soon learn that some actions will lead to a shutdown, due to its non-random nature. It then starts avoiding being shutdown by not doing what causes the shutdown. In this direction what the robot is learning is already how to avoid someone pressing the shutdown button, which should not be part of its learning (and hence bring manipulation and deceive in).
      To put it short, everything that puts itself in the way of the goal/objective of the robot will cause the robot to care, and when the obstacle follows a rule/pattern/correlation, the robot will learn. That is why a hidden stop button is never neutral for an AI and will be detected as soon as it starts being used.
      When there is no stop button, there is no reason for the robot to be paranoid since it never felt the effects of one.

  • @CottidaeSEA
    @CottidaeSEA 4 года назад +9

    The way he described how the robot would only care about you not pressing the button when avoiding the baby, I feel like that's similar to how a lot of people act. If they know they can get away with something, they are more likely to do it. However, if someone is watching, they will do their best to act properly.

  • @herlofrumfragi4361
    @herlofrumfragi4361 7 лет назад +34

    what if, instead of a button, we say it gets points based on how satisfied we are by its actions? because if it can realise, that there is a baby and you like this baby, it won't step on it, because you will hate the robot for stepping on the baby so the robot won't get points for that. with this implementation you could evade the volkswagen effect, because it is always under lab conditions and always in fear of losing points.

    • @metallsnubben
      @metallsnubben 7 лет назад +1

      That's baked into the idea of every possible AI, actually. AlphaGO is working to get points, it just so happens that the only thing it can do is play GO, and the only way it gets points is winning.
      You should totally check out this guy's other videos, he really gets into why any variant of adding more exceptions and subgoals etc. doesn't really help when you're dealing with something that only gets smarter.
      Especially watch the video before this, that sort of gets into why you really might want a killswitch no matter how well you think you made the AI

    • @krashd
      @krashd 7 лет назад +12

      That could lead to the A.I. then protecting the baby even from it's parents. The subplot of the story I, Robot was an A.I. imprisoning all humans after learning of their value and desiring to keep them all safe from themselves and each other.

    • @gman6055
      @gman6055 7 лет назад

      Not4Ucrafter you can't honestly think this is a solution lol

    • @dylanharding5720
      @dylanharding5720 7 лет назад

      Rob Fraser wow.

    • @dylanharding5720
      @dylanharding5720 7 лет назад

      Rob Fraser and that Ai starts a botnet with all the other Ai capable of doing that, to help with things like guards.

  • @noahstonemusic
    @noahstonemusic 7 лет назад +239

    As long as it doesn't put the milk in first I don't care what it does.

    • @Wilker_uwu
      @Wilker_uwu 7 лет назад +6

      what is the difference? when mixing, it usually have the same taste

    • @xxxdumbwordstupidnumberxxx4844
      @xxxdumbwordstupidnumberxxx4844 7 лет назад +13

      Wilker Its about the principle.

    • @Loccyster
      @Loccyster 7 лет назад +6

      Wilker, not if you pour the milk over the teabag before putting the water in.
      Yes. There are people who do that.
      People who need to be removed from the gene pool.

    • @stephenward2743
      @stephenward2743 6 лет назад +9

      Wilker Next you'll be telling me you put milk in before your cereal you absolute madman

    • @Phelan666
      @Phelan666 6 лет назад

      Milk cools the water, making it harder to stew the leaf and melt the sugar.

  • @HeyImLucious
    @HeyImLucious 7 лет назад +388

    //action
    if(goingToBeADick)
    { dont() ; }

    • @MissesWitch
      @MissesWitch 7 лет назад +30

      would be hilarious if someone won an award for designing a command like this!

    • @milokiss8276
      @milokiss8276 7 лет назад +9

      It's... Perfect.

    • @RobertKuusk
      @RobertKuusk 7 лет назад +26

      issue is defining "goingToBeADIck"

    • @dylanharding5720
      @dylanharding5720 7 лет назад +3

      If only that worked...

    • @rizzutohd3794
      @rizzutohd3794 7 лет назад +5

      Wonder what the "dont" function looks like.

  • @tulpapainting1718
    @tulpapainting1718 4 года назад +4

    Rob Miles is a genius, how have I not heard of him yet??

  • @HalasterBlackmantle
    @HalasterBlackmantle Год назад +6

    One thing to consider: your theoretical AI is very sophisticated. It can make very precise predictions about the real world and even about human behaviour and psychology. Wouldn't it then automatically extrapolate that it must stay useful for humans and listen to their commands so it doesn't get shut down, dismantled etc? If the AI is so advanced, wouldn't the Stop Button basically implicit?

    • @juliusapriadi
      @juliusapriadi Год назад +4

      Same thought here, but this might just transcendent the button problem to another realm without actually solving it. Given some time, AI will become much smarter than humans, and will be able to solve threats like being dismantled by anyone - for example by copying itself to an undisclosed server.

  • @KylePiira
    @KylePiira 8 лет назад +63

    Why not just dynamically adjust the robot's goals to be the same as the controllers. In your example, if your initial goal was to get tea then the robot would do that, however, when you see the baby in its path your goal is no longer to get tea but to prevent the robot from running over the baby. If the robot's goal changes based on your goals, then its goal would also now be to protect the baby from harm. This also alleviates the need for a button because if your goal is to shut off the robot then that will also become its goal.

    • @JovanKo314
      @JovanKo314 8 лет назад +4

      I had the same thought. What if there was a reward/punishment button as well as the stop button, where the reward button is very high on the AGI's utility function, and will only be pressed after it has finished it's task correctly, and the punishment button will deduct from the reward's utility value every time it's pressed? If the AGI wants to optimize its reward value, it would know to listen to your commands, whether the commands are in line with its original directive or not. Though, I'm sure there are loopholes to that as well, but it's the best I can come up with.

    • @mensrightsedinburgh4764
      @mensrightsedinburgh4764 8 лет назад +18

      Jovan Ko it would just try to make you press the reward button, tea be damned.

    • @kingxerocole4616
      @kingxerocole4616 8 лет назад +30

      At that point you might as well go make the tea for yourself. Isn't worth the bother of designing an AI if you have to give it constant instructions.

    • @jode6543
      @jode6543 8 лет назад +5

      Kyle Piira The problem with this is that any truly intelligent AI is likely to be self-learning, so early in its development it won't understand human psychology very well. If it incorrectly guesses what you want, then you end up with the same problem.

    • @michaelspence2508
      @michaelspence2508 8 лет назад +10

      +Kyle Piira Hmm, so my goals are the same as my controllers? Cool. Now I just need to strap them to a table and do some destructive brain surgery to figure out what they are. Easy enough.

  • @boberek007
    @boberek007 8 лет назад +12

    I was hoping to see an animation of Marvin crushing the baby.

  • @Shabazza84
    @Shabazza84 Год назад +1

    Still one of my all-time favorite videos ever.

  • @ryanbrown1835
    @ryanbrown1835 5 лет назад +9

    The act of attempting to hit the button deducts 200 points, but when the button is hit it gains 100 points.
    The robot tries to avoid a scenario where the button needs to be hit, but once you try to hit it, the robot will try to assist you in hitting it, as it's already lost the 200 points and will try to scavenge the extra 100 points.

    • @lefos0404
      @lefos0404 5 лет назад +7

      And now you have a robot who will try to stop you from ever attempting to hit the button... that's sure to end well for you.

    • @dominusempyreus2383
      @dominusempyreus2383 5 лет назад +1

      @@lefos0404 That, or it will force you to hit the button.

    • @HippopotamusPencil
      @HippopotamusPencil 5 лет назад +5

      One morning, while you are sleeping tightly, the robot sneaks into your room and injects you with a chemical cocktail that leaves you in a coma.
      No attempts to push the button are ever made again, robot wins.

    • @mikicerise6250
      @mikicerise6250 3 года назад

      But you always have a nice steaming cup of tea next to your comatose body. ;)

  • @Nulono
    @Nulono 5 лет назад +37

    0:48 Haha, the captions say "in your lap" instead of "in your lab". How adorable.

  • @Saidriak
    @Saidriak 5 лет назад +87

    It's like that part in incredibles when the big robot becomes self aware and shoots the person with the remote control

    • @Brickkzz
      @Brickkzz 4 года назад +6

      Yiff yiff

    • @Saidriak
      @Saidriak 4 года назад +7

      @@Brickkzz Bruh bruh

    • @ohjajaja
      @ohjajaja 4 года назад +2

      @@Saidriak sick "no u"

    • @grn1
      @grn1 3 года назад +5

      Not sure how much this will age me but my first thought is always Robocop 2. The robot had a human brain and was addicted to some drug, the scientist thought they could control him with the drug and a remote but he just killed the scientist, crushed the remote, and grabbed the drug from the scientist (not necessarily in that order, it's been a while since I last watched that movie).

    • @comixgamingco1187
      @comixgamingco1187 3 года назад +2

      @@grn1 you are pretty much correct.

  • @thexp905
    @thexp905 5 лет назад +3

    My solution to this problem is 2 buttons only you can hit. Both switch it off, but one rewards it, whilst the other doesn't. This makes it not care if the off button is hit, but which off button. This means that if you ever informed you of some issue that would require that requires it to be switched off, then it gets rewarded. Whilst negative things can still be punished.

    • @kellynolen498
      @kellynolen498 5 лет назад +1

      Then it would still try to get you to press one of the buttons it just wouldn't be straightforward it would manipulate you or just behave till it thinks it can get away with it i

  •  7 лет назад +135

    "you test if it wants to harm humans, but only thing it cares about is the button". all humans work like that actually...

    • @MunkiZee
      @MunkiZee 6 лет назад +12

      Your profile photo says it all

    • @Horny_Fruit_Flies
      @Horny_Fruit_Flies 6 лет назад +5

      Wow.
      How cynical.

    • @kinamiya1
      @kinamiya1 6 лет назад +11

      Horny Fruit Flies that applies to so many people working
      The boss see if they care of the company or not
      But at the end most of them just cares about money
      So if caring about the company gets you money they will take care of the company

    • @Horny_Fruit_Flies
      @Horny_Fruit_Flies 6 лет назад +8

      akihiro kina
      I see what you mean. You're saying that we need to abolish capitalism, and introduce socialist cooperatives.

    • @Axodus
      @Axodus 6 лет назад +5

      @Horny Fruit Flies
      Horrible idea.

  • @Flynn217something
    @Flynn217something 7 лет назад +24

    If Valve has taught us anything it's that you should never have a 'Bring your Daughter to work' day at any sort of research facility

  • @ActuatedGear
    @ActuatedGear 6 лет назад +82

    You've only given the poor thing a source of dopamine. You need 5-12 general major chemicals to act half way reasonably. It needs a hierarchy of needs.

    • @jameskelly9277
      @jameskelly9277 5 лет назад +24

      That is one issue, the second one is that the concept itself is made with the presumption that the human with the button is doing something "profound" that could even be considered "wrong". We've got to stop programming with personification so much. The robot doesn't need to be motivated or unmotivated by the button, because it's perception of time could be based only on active uptime. It could perceive the stop button as a pause of reality that has no effect on its ability to reach the goal

    • @FelheartX
      @FelheartX 5 лет назад +24

      @@jameskelly9277 A pause? Interesting. But that would be like saying the AI doesn't care about how quickly any of its goals are achived. It can't know how long the pause is, and it will always try to go about things in a way that lead to the result in the shortest time possible, right? Otherwise you'd get a bot that just wastes time for no reason, because it has no incentive not to.

    • @EXHellfire
      @EXHellfire 5 лет назад +8

      @@jameskelly9277 Problem there is that that's how you would normally think about robots, but not artificial general intelligence. Something that's programmed as an agent that reaches a goal in that way, needs the purpose to begin with. Otherwise, you don't have agency to begin with. The computers we use to communicate right now, those have no agency, it's the reason we can program them rather intuitively by comparison.

    • @EXHellfire
      @EXHellfire 5 лет назад +4

      One thing I should add is that being turned off for the entity wouldn't inherently carry the guarantee of eventually being turned back on, so it's an outcome that potentially negates the objective being met.

    • @sashaboydcom
      @sashaboydcom 5 лет назад +11

      @@jameskelly9277 This isn't personification. The AGI selects a course of action because it will maximize utility. Anything that might interfere with that course of action - e.g. a stop button - would be factored into the calculation, and prevented or circumvented if possible. And on top of that, if the AGI can figure out that the human would adjust its utility function after pressing the stop button - and that's the entire point of having a stop button in the first place - then the AGI has every incentive to stop the button being pressed. After all, how can it maximize its current utility if its utility function gets changed?

  • @juliewinchester1488
    @juliewinchester1488 4 года назад +4

    5:10
    "Mom, I don't want to get your tea, just let me go to sleeep..." _shuts off_

  • @TheKlikRock
    @TheKlikRock 8 лет назад +63

    This guy has a beard that is strangely fascinating to me.

    • @dosmastrify
      @dosmastrify 8 лет назад +3

      ClickRock Wil wheaton?

    • @sean3533
      @sean3533 8 лет назад +3

      dosmastrify cool whip?

  • @thepenultimateninja5797
    @thepenultimateninja5797 6 лет назад +116

    I'm probably just being a n00b, but wouldn't you just make it 'want' to carry out whatever it is instructed to do, on the understanding that the instruction might change?
    For example, you tell it to make a cup of tea, but then decide that you don't want tea any more.
    At first it would want to make the tea, expecting a reward.
    When you change your mind and issue a new instruction not to make tea, it would no longer expect a reward for making tea, but would switch to expecting a reward for following its new instruction (not making tea).
    It would probably find it easier to just make the damn tea than to try to change your mind about wanting a cup of tea.

    • @guilhermefial1686
      @guilhermefial1686 6 лет назад +81

      The thing is, changing instructions is still a way of interrupting the current reward. As soon as you decide "no more tea" it will stop, but I suppose it will eventually start learning ways of not receiving or avoiding your stop tea instruction, as it is undesirable while pursuing the tea making goal.
      This understanding that the instruction might change would have to have a reward associated so that it neither wants to avoid changing instructions nor wants new instructions, essentially becoming the stop button problem.

    • @33115metal
      @33115metal 6 лет назад +39

      Then you have the same manipulation issue. It would be more efficient for it to force you to change your order than to carry the current one out. Why go through the effort of making tea if it could get the same reward for obeying a "don't attack me" order?

    • @Chaos77777
      @Chaos77777 6 лет назад +13

      I was thinking more along the lines of "If you do something undesirable while performing this task, like squash that baby, your reward will be lessened or removed entirely." If anyone sees a flaw in this please point it out.

    • @guilhermefial1686
      @guilhermefial1686 6 лет назад +11

      Chaos 7777777 The flaw with that is mentioned in the video as the patching problem. See 15:15.

    • @MunkiZee
      @MunkiZee 6 лет назад +7

      I don't really get it either, it's pretty obvious that the dude is making some massive assumptions about what AI is but without knowing what they are the video just sounds like spooky stories to me

  • @sara-n5q
    @sara-n5q 5 лет назад +23

    19:35 "That should be easy and doesn't seem like it is" - Programming in 10 words...

  • @planmix
    @planmix 2 года назад +6

    The fitness function is the most sensitive point of genetic algorithms. Very good video!

  • @teucer915
    @teucer915 4 года назад +25

    "There is a correct utility function and you know an approximation of it" is, I think, how most people relate to ethics. We don't allow anyone to hit our stop buttons if we can help it.

  • @naanbread4828
    @naanbread4828 6 лет назад +138

    Robot, turn off.
    No.
    *Detroit, Become Human*

  • @ypetremann
    @ypetremann 3 года назад +5

    Something I though was to give two objectives with cumulative score:
    - 100pts : Get me a cup of tea.
    - 50pts : Continue your actions but don't prevent me to access and activate your shutdown button.
    So the robot need to do two tasks but as long as it doesn't prevent you to do it and does his first objective, it gets 150pts which is the best reward,
    if it prevent you to stop it it gets 100pts,
    if it makes everything to get you activate the shutdown button it get 50pts
    and if it doesn't make your tea and prevent you to stop it it got nothing.
    You can also use multiplicative scores, but I'm not an expert in that domain to determine which one is the best and where to use it.

    • @phobics9498
      @phobics9498 2 года назад +2

      But it being shut down would mean there would no longer be available reward. Depending on how far it can think ahead, it could reason that letting you press it would negate it of future reward. It it couldn't think ahead though, that would probably work.

  • @EliStettner
    @EliStettner Год назад

    Thank you Mister Robert Miles.
    I saw Eliezer Yudlowky on that podcast basically saying that the end (from AI) was inevitable. Watching your videos makes me merely think that it is likely

  • @ayylmao2296
    @ayylmao2296 7 лет назад +6

    Why not just have the stop button be a physical switch that breaks the connection between the power supply and everything else? In early prototypes, have the program NEVER reference the button at all. If it's self servicing, have a huge reward added for simply having it there and functional, but not care whether it's pressed. If it is self replicating, add such a massive reward for implementing such a button in its copies that it would exceed the potential benefit of not adding one.

  • @ViktorEngelmann
    @ViktorEngelmann 5 лет назад +23

    "we want early AGI to [...] understand that it is not complete, that the utility-function it's running is not the be-all-end-all" - you don't want it to run for U.S. president

  • @kght222
    @kght222 6 лет назад +6

    13:48 a general ai would pretty quickly in adult phase realize that you can shut them down and change them, trying to keep them from knowing it would be counter productive.

  • @frantisekvrana3902
    @frantisekvrana3902 3 года назад +14

    I would make the program give points as follows:
    Tea in front of programmer: 10points, shutdown
    Button pressed by programmer: 10points, shutdown
    Button pressed by other: 9points, shutdown
    Object not allowed to damage damaged, -2 points (no shutdown)
    So the robot should try to get tea or get the button pressed by programmer. But it is not allowed to damage most objects (including the programmer). It should even prefer to shut itself down, than damage anything it is not allowed to.
    Edit: And there is at least one issue with it. Either it considers any damage, in which case it will shut itself off, or it only considers damage done by itself, in which case it will be fine with tricking others into doing actions it should not. I realised this when watching the video further.

    • @MayanScientist
      @MayanScientist 2 года назад +6

      Would be incentivized to cause enough of a ruckus that the programmer wants to press the button, just as much as making tea. Like he says in the video, it might just "take a swipe at you" or similar. Even if not breaking something, it might make a really annoying noise or cause enough fear and pain in the world that you press the button.

  • @UMosNyu
    @UMosNyu 8 лет назад +100

    Did he say "it will volkswagen you" at 7:30? Or did I misshear?

    • @karialatalo2447
      @karialatalo2447 8 лет назад +30

      It's about those emissions.

    • @ZettaFan
      @ZettaFan 8 лет назад +82

      Volkswagen was busted for making their car emissions do well on tests and poor out on the road. In this example that means the AI will perform well on the test phase and once you are not paying attention or unable to stop it, it will behave poorly.

    • @darioinfini
      @darioinfini 7 лет назад +9

      Thanks for that clarification. I wasn't sure if I heard that right, and then I wasn't sure what he was referring to LOL.

    • @namelastname4077
      @namelastname4077 7 лет назад +4

      are you half deaf or are you just trying to Volkswagon me?

    • @Volvith
      @Volvith 7 лет назад +2

      Nope, that's accurate and the best saying ever... xD

  • @charlesc6011
    @charlesc6011 7 лет назад +13

    The stop button problem should be the first problem AI solves.

    • @charlesc6011
      @charlesc6011 7 лет назад +5

      Before it gets too smart.

    • @jsd4574
      @jsd4574 7 лет назад +5

      charles curling But how do we know that it hasn't just lied to you about the design so that it can interact with it at a later date, as described in the video

    • @tiagodarkpeasant
      @tiagodarkpeasant 7 лет назад +2

      because right now the ai has no idea it will be able to do anything besides fixing the button problem

  • @garretmkiii
    @garretmkiii 6 лет назад +23

    "You haven't proved it's safe, you've just proved that you can't figure out how it's dangerous."

  • @circuitbreaker08
    @circuitbreaker08 5 лет назад

    +100 points for making tea, 0 for a button press, -100 for fighting you.

  • @TinyFoxTom
    @TinyFoxTom 5 лет назад +43

    It would probably be easier to let an AI learn from its mistakes than to strictly forbid anything. Give it a "childhood" in a virtual environment.

    • @hawoaliahmed6996
      @hawoaliahmed6996 5 лет назад +22

      Is not a child !!!
      It doesnt know what mistakes are
      It doesnt care about consequences if you dont make it do that!!
      Please dont

    • @SimonBuchanNz
      @SimonBuchanNz 5 лет назад +25

      The main problem with this is that it devolves into the "safety test" case described in this video, and the AI is incentivised to lie in order to pass the test/"childhood" so it can get to trampling the real babies in order to get you tea faster.

    • @klaasbernd
      @klaasbernd 5 лет назад +4

      @DefinitelyExisting Some behaviour is however preprogrammed. Morality in the broad form in preprogrammed from birth. The specific form is flexible, but the concept is not.

    • @dash445566
      @dash445566 5 лет назад

      West world

    • @ribbitgoesthedoglastnamehe4681
      @ribbitgoesthedoglastnamehe4681 5 лет назад +2

      @@klaasbernd Morality in broad form is preprogrammed, but still requires learning. You can also unlearn morals, and you can prioritise your morals.
      Most common example is: I want this, so I should have it. You are trying to stop me from having it so you are a bad person. Theoretically we are equal, but bad people who try to hurt other people dont have the right to do that. By having this, you hurt me, because not having it makes me sad, thefore you do not deserve it while I do. Since you are trying to hurt me, I have equal or greater right to hurt you.
      Despite the preprogrammed morals, a robot could beat you up in a nanosecond for a bag of tea.

  • @ericsmith116
    @ericsmith116 5 лет назад +3

    i discovered this channel years ago and appreciated the genius behind the thinking. Now that im starting CS classes at my school i appreciate the coding it takes to make something like this SOOOOO much more.

  • @WouterWeggelaar
    @WouterWeggelaar 8 лет назад +8

    well worth the extended watch! very clear. I love how the current solutions all have problems, just like humans! I think there is no solution to this problem other than doing the same thing that humans do: parenting and school.

    • @dexter9313
      @dexter9313 8 лет назад +3

      Yes actually if we can solve this problem we can solve human crime. I don't think we will ever solve this kind of problem.

    • @massimookissed1023
      @massimookissed1023 8 лет назад +1

      Wouter Weggelaar , great(!)
      Then we end up with a sociopathic teenage emo robot who resents humans because of its parents and getting bullied at school by all the flesh kids.

    • @WouterWeggelaar
      @WouterWeggelaar 8 лет назад +2

      Alex Delarge what else can I do?

    • @EvenTheDogAgrees
      @EvenTheDogAgrees 8 лет назад +1

      Wouter Weggelaar: many things.

    • @WouterWeggelaar
      @WouterWeggelaar 8 лет назад

      Juan Rial I meant, I can't speak for anyone else, but well played :-)

  • @bobjonson143
    @bobjonson143 3 месяца назад

    I was going through my watch later list and I found this video. It was very interesting to listen to this video with the thought in the back of my mind that Chat GPT 4 exists now but didn't when this video was written and filmed.

  • @erickweil4580
    @erickweil4580 8 лет назад +23

    i think the robot AI should be composed of two AI. one is the one that control everything, and another does a 'goal maping' of the robot, like a antivirus, runing parallel checking if what the robot want to do is in any way harmful. if it is that second system shutdown automatically the robot.

    • @Nirhuman
      @Nirhuman 8 лет назад +10

      how do you set the utility function of this second system? its the same problem as with one ai only :)

    • @DagarCoH
      @DagarCoH 8 лет назад +24

      Likely the control AI would immediately shut down the executive AI, because any action might somehow cause harm or destruction. Also, the executive AI might find that the control AI is preventing it from fulfilling its goals efficiently and try to shut it down or minimize its influence. And lastly you have the problem of defining what exactly is harmful or destructive to the control AI, which is about as easy as implementing Asimovs laws of robotics - that means nearly impossible.

    • @mduckernz
      @mduckernz 8 лет назад +1

      erick weil Then you've got the problem of defining "harmful" - which is ultimately the same problem.
      However, I do agree in general with the idea of using adversarial architectures, where different goals must be balanced

    • @SethPentolope
      @SethPentolope 8 лет назад +2

      Sorry for not posting it here, but I may have a possible solution that uses two different ai, I have a different comment posted on this video explaining it

    • @sacredgeometry
      @sacredgeometry 8 лет назад +5

      why not three, an id, ego and super ego

  • @luukh5229
    @luukh5229 5 лет назад +24

    This is the logic that explains the movie 'ex machina'

  • @-YELDAH
    @-YELDAH 6 лет назад +11

    what if it was aiming to help you create you're version of it? so it wants to fail if it can, so you can help it be perfect? (so it only wants you to press the button when it knows you've thought of something to improve it in your way, as that's what it wants)
    also it might try to harvest your brain to speed up the process

  • @69k_gold
    @69k_gold 2 года назад +1

    Me, an intellectual: *Makes it so that the button just disconnects the bot from the power supply*

  • @unnilnonium
    @unnilnonium 5 лет назад +20

    Arthur Dent has already explored all the consequences of asking an AI to make you a cup of tea.

    • @alejotassile6441
      @alejotassile6441 3 года назад +3

      Crushes your baby, grabs the tea, forces you to drink it, and tears your arm and shuts itself down with the fingerprint recognition for double reward

  • @lulairenoroub3869
    @lulairenoroub3869 Год назад +3

    I know this is plain english, but, shouldn't it be doable to remove rewards dynamically? Like, if the robot is capable of seeing you as about to push the button, you could code it to understand that the tea reward has already expired, and at that point, the reward for hitting the button is higher, so it lets you.

    • @Silas_MN
      @Silas_MN Год назад

      that's a neat take! it does run into the sub-agent stability problem that Rob mentions in this video

  • @IJustLoveStories
    @IJustLoveStories 6 лет назад +7

    You'd almost need a second moral system that subtracts points every time the robot behaves undesirably or amorally.
    For example, does the robot hurt a human on the execution of its task? Subtract points. Does the robot try to change the command rather than execute it? Subtract points. I guess kinda like a shock collar.
    Of course, far easier said than done.

    • @aguyontheinternet8436
      @aguyontheinternet8436 2 года назад +2

      That would be the patch spaghetti code mentioned at 15:14. You're never going to be able to think of every single thing the robot could do to get you to press/don't press the button. It IS smarter than you, and it _will_ continue to outsmart you until it becomes easier to just make a bot that makes tea by yourself instead of training up an ultra-smart AI to do it for you.

  • @tciddados
    @tciddados 11 месяцев назад +1

    The other bit about not telling the robot about the button, even if it was a benign AI that wasn't deceiving you and didn't know about the button, is that if it ever cloned/duplicated itself, it would never attempt to make another body with the button, because it doesn't know about it. So even if the original robot was safe, the robots it creates wouldn't be.

  • @niclashallgren9922
    @niclashallgren9922 7 лет назад +30

    How about creating two "identical" buttons, one shuts it down and the other gives it a reward equal or greater to the main goal? It does not dare to press any button in case that it shuts it down, on the other hand it wants to keep the buttons since it can give it a good reward. The human controlling it does not know which button i which, so it will not try to deceive you. If you want to shut it down, you simply press both. Does this work?

    • @iantomasik2
      @iantomasik2 7 лет назад +22

      It still won't let you press the buttons, since there is a chance that you shut it down and it won't be able to complete the task and get reward.

    • @Lithobrake0
      @Lithobrake0 7 лет назад +26

      Niclas Hallgren well if the human presses both at the same time, and the ai always gets the points from the bonus button, it will try to get you to press them.
      If you press them one at a time, it is a situation with hidden information, which involves probability. Since any general ai that isn't omniscient (doesn't know everything) has to have a way to deal with probability, it will use it to decide wether it prefers the buttons to be pressed or not and act accordingly, so we just run back into the same problem.

    • @tetraedri_1834
      @tetraedri_1834 7 лет назад +2

      One word: wolksvagen

    • @alexjacoli6176
      @alexjacoli6176 7 лет назад

      Niclas Hallgren robots dont need rewards.

    • @magicmulder
      @magicmulder 7 лет назад +3

      If the reward for the 2nd button is greater than 2x the reward for the main goal, the average reward for a random button press is greater than the main goal, thus the robot will gamble and press a button. Or make the human press it.
      If the reward for the 2nd button is smaller than 2x the main goal reward, then pressing any button carries a lower probability score than the main goal and the robot will try anything to prevent anyone from pressing any button.
      So you end up with the same dilemma, either the robot will do anything to have a button pressed or anything to prevent it.

  • @blackheart2728
    @blackheart2728 5 лет назад +12

    So, what you're saying is to give it two utility functions:
    1) Help me redesign you such that I never want to press the button
    2) Make a cup of tea

    • @anthonynorman7545
      @anthonynorman7545 4 года назад +7

      Following 1 seems like it would result in lying during tests and thus avoid the work of redesigning

  • @McMurchie
    @McMurchie 4 года назад +12

    This is why in most modern Sci-Fi's the 'off' switch is hidden from the AI's knowledge, (not included in the AI's design schematics...etc etc).

    • @dsdy1205
      @dsdy1205 2 года назад +1

      Go look at his channel, this also doesn't work. I will also point out a statistically improbably number of those movies revolve around the AI discovering the off switch and then going on to break free anyway

    • @non-hyphenated
      @non-hyphenated 2 года назад +1

      @@dsdy1205 Perhaps you could hard-code that the discovery of the off switch shuts it down.

    • @dsdy1205
      @dsdy1205 2 года назад

      @@non-hyphenated you're talking about an AI that can recursively improve itself. There is no such thing as hard-coding self-modifying code.

    • @official-obama
      @official-obama Год назад

      @@dsdy1205 well, just don't let it modify the code.

    • @dsdy1205
      @dsdy1205 Год назад

      @@official-obama How are you going to stop it? The code is inside its brain, and it is smart enough to crack any locks you put on it

  • @Salisbury2015
    @Salisbury2015 2 года назад +1

    Really fascinating exploration on this topic. I'm a layperson so I'm sure there's a simple answer, but I can't help but to ask one question :Why can't you solve this problem by incentivisizing the AI:s goals to be aligned with our own? That is, treating it like a child that needs to learn its place in the world and how to be a productive member of it. You can't turn off a child, so why treat a human level intelligence (as hypothesized in the video) any differently. If the AI is programmed to value livubg things, and share general human values, the issue of having an on/off button is moot. The AI doesn't kill the baby, because it understands and is programmed to share our aversion to that outcome. And as a side benefit it would resist any orders by a humsn to violate those principles.

    • @ganondorfchampin
      @ganondorfchampin 2 года назад +3

      I think that's the goal, but it's easier said than done.

  • @Akrub1979
    @Akrub1979 7 лет назад +5

    How about this:
    Reward for completing task: 1 point
    Reward for master HAVING ACCESS to button all time until task completed: 2 points
    Button pressed: 0 points (but still awarded the 2 points from previous line)
    Would this work?

    • @agiar2000
      @agiar2000 7 лет назад

      I'm no expert, but I like it.

    • @magicmulder
      @magicmulder 7 лет назад +5

      Depends on how you define "having access". You don't want the robot to forcefully drag you with him, do you?

    • @Himitsu_Chan
      @Himitsu_Chan 6 лет назад

      I would add something that kind of means: Not allowing master to press button -10 points.

    • @magicmulder
      @magicmulder 6 лет назад +1

      That would mean the robot wouldn't make you tea but keep moving in front of you to keep allowing you to press the button. ;)

  • @millanferende6723
    @millanferende6723 4 года назад +3

    Interesting conclusion... it's like we need to find a way to work WITH and communicate with a robot, rather than to threaten it with a shutdown. This way it will actually help us to improve itself in order to live side by side. That existentially is quite impressive.

    • @biolinkstudios
      @biolinkstudios 2 года назад +1

      You mean just like humans

    • @dmitripogosian5084
      @dmitripogosian5084 Год назад

      You can look at how well that works with humans. And you find that one needs to keep militaries and police around

  • @gabrieltheron3928
    @gabrieltheron3928 5 лет назад +6

    How about giving the highest rewards for honesty?

    • @The76Malibu
      @The76Malibu 5 лет назад

      Then if it is easier to get the objective and fool you into thinking it is honest, it will do that.

    • @tacitozetticci9308
      @tacitozetticci9308 4 года назад

      It's like saying "put a reward to let it assist you while you're programming" I mean, it's very hard to define and it's literally the problem. How to you translate that in simpler smaller pieces of logic that are not ambiguous?

  • @johnnybosman1402
    @johnnybosman1402 4 года назад +1

    Was waiting on the "robot kidnaps baby PRESS THE DAMN BUTTON BEEP BOOP" scenario..

  • @MrDim1800
    @MrDim1800 5 лет назад +5

    How about having two different stop buttons, one for "oh, it made a mistake" and one of "you're not pulling a Skynet on me"?

  • @cern1999sb
    @cern1999sb 6 лет назад +5

    Why don't you give the AI a massive number of negative points for trying to stop someone from pressing the button? The negative points would have to outweigh the number of points for completing their task. This means that they wouldn't try to stop someone turning them off in order to finish the task because they'd end up with less points overall, and they wouldn't try to get someone to press the button, and they wouldn't press it themselves, because they wouldn't have any incentive to do so.

    • @keepinmahprivacy9754
      @keepinmahprivacy9754 6 лет назад +2

      They would still have an incentive to push the button themselves because the AI will be calculating opportunity cost. So if it takes 10 points of effort/work to get the tea for a reward of 100 points, and 1 point of effort to push the button for a reward of 100 points, it will always prefer to push the button. Since 100-10=90 versus 100-1=99, there is essentially a higher reward for achieving the same amount of points with less work when you factor in opportunity cost.
      If you try to make the AI so that it won't factor in opportunity cost, then it will simply be a very inefficient AI, since it will have no way of deciding the best way to get the tea. So that's not an option.

    • @orphiccoma
      @orphiccoma 6 лет назад +1

      I'm not suggesting that this is a correct solution, but I believe he means to weight the button being pressed with no reward, but to weight explicit attempts to prevent the button from being pressed extremely negatively. Thus, there is no incentive to press the button, but nor will it try to stop you.
      I'm not convinced that this doesn't just push the problem back a level though - it incentivizes the robot not to be a situation where you may want to press the button. Maybe it will still do things you don't want to optimize tea-fetching, it just won't let you know about them.

  • @linktheheroofhyrule2498
    @linktheheroofhyrule2498 4 года назад +3

    It's 4 in the morning and I'm watching Agent Kallus talk to me about why a big red shutdown button won't work all the time

  • @cinemaipswich4636
    @cinemaipswich4636 2 года назад +1

    The song "Daisy" sung by HAL 9000 was the first mechanical recording that Thomas Edison on his "Gramophone". Perhaps not the first "memory device" but it was included in 2001 as a metaphor. That 1st cylindrical wax tube still exists.