AI That Doesn't Try Too Hard - Maximizers and Satisficers

  • Published: Jun 9, 2024
  • Powerful AI systems can be dangerous in part because they pursue their goals as strongly as they can. Perhaps it would be safer to have systems that don't aim for perfection, and stop at 'good enough'. How could we build something like that?
    Generating Fake YouTube comments with GPT-2: • Generating Fake YouTub...
    Computerphile Videos:
    Unicorn AI: • Unicorn AI - Computerp...
    More GPT-2, the 'writer' of Unicorn AI: • More GPT-2, the 'write...
    AI Language Models & Transformers: • AI Language Models & T...
    GPT-2: Why Didn't They Release It?: • GPT-2: Why Didn't They...
    The Deadly Truth of General AI?: • Deadly Truth of Genera...
    With thanks to my excellent Patreon supporters:
    / robertskmiles
    Scott Worley
    Jordan Medina
    Simon Strandgaard
    JJ Hepboin
    Lupuleasa Ionuț
    Pedro A Ortega
    Said Polat
    Chris Canal
    Nicholas Kees Dupuis
    Jake Ehrlich
    Mark Hechim
    Kellen lask
    Francisco Tolmasky
    Michael Andregg
    Alexandru Dobre
    David Reid
    Robert Daniel Pickard
    Peter Rolf
    Chad Jones
    Truthdoc
    James
    Richárd Nagyfi
    Jason Hise
    Phil Moyer
    Shevis Johnson
    Alec Johnson
    Clemens Arbesser
    Ludwig Schubert
    Bryce Daifuku
    Allen Faure
    Eric James
    Jonatan R
    Ingvi Gautsson
    Michael Greve
    Julius Brash
    Tom O'Connor
    Erik de Bruijn
    Robin Green
    Laura Olds
    Jon Halliday
    Paul Hobbs
    Jeroen De Dauw
    Tim Neilson
    Eric Scammell
    Igor Keller
    Ben Glanton
    Robert Sokolowski
    anul kumar sinha
    Jérôme Frossard
    Sean Gibat
    Cooper Lawton
    Tyler Herrmann
    Tomas Sayder
    Ian Munro
    Jérôme Beaulieu
    Taras Bobrovytsky
    Anne Buit
    Tom Murphy
    Vaskó Richárd
    Sebastian Birjoveanu
    Gladamas
    Sylvain Chevalier
    DGJono
    Dmitri Afanasjev
    Brian Sandberg
    Marcel Ward
    Andrew Weir
    Ben Archer
    Scott McCarthy
    Kabs
    Miłosz Wierzbicki
    Tendayi Mawushe
    Jannik Olbrich
    Anne Kohlbrenner
    Jussi Männistö
    Mr Fantastic
    Wr4thon
    Martin Ottosen
    Archy de Berker
    Marc Pauly
    Joshua Pratt
    Andy Kobre
    Brian Gillespie
    Martin Wind
    Peggy Youell
    Poker Chen
    Kees
    Darko Sperac
    Truls
    Paul Moffat
    Anders Öhrt
    Marco Tiraboschi
    Michael Kuhinica
    Fraser Cain
    Robin Scharf
    Oren Milman
    John Rees
    Seth Brothwell
    Clark Mitchell
    Kasper Schnack
    Michael Hunter
    Klemen Slavic
    Patrick Henderson
    Long Nguyen
    Melisa Kostrzewski
    Hendrik
    Daniel Munter
    Graham Henry
    Volotat
    Duncan Orr
    Marin Aldimirov
    Bryan Egan
    James Fowkes
    Frame Problems
    Alan Bandurka
    Benjamin Hull
    Tatiana Ponomareva
    Aleksi Maunu
    Michael Bates
    Simon Pilkington
    Dion Gerald Bridger
    Steven Cope
    Marcos Alfredo Núñez
    Petr Smital
    Daniel Kokotajlo
    Fionn
    Yuchong Li
    Nathan Fish
    Diagon
    Parker Lund
    Russell schoen
    Andreas Blomqvist
    Bertalan Bodor
    David Morgan
    Ben Schultz
    Zannheim
    Daniel Eickhardt
    lyon549
    HD
    / robertskmiles
  • Science

Comments • 1.2K

  • @mihalisboulasikis5911
    @mihalisboulasikis5911 4 years ago +1537

    "Intuitively the issue is that utility maximizers have precisely zero chill". Best intuitive explanation on the subject ever.

    • @tonicblue
      @tonicblue 4 years ago +47

      I think this quotation is precisely why I love this guy.

    • @mihalisboulasikis5911
      @mihalisboulasikis5911 4 years ago +41

      @@tonicblue Exactly. These types of explanations (which are not "formal" but do a much better job at conveying a point - especially to non-experts - than formal explanations) make you realize that not only is he a brilliant scientist, but also has intuition and experience on the subject which in my opinion is also extremely important. And of course, the humor is on point, as always!

    • @tonicblue
      @tonicblue 4 years ago +2

      @@mihalisboulasikis5911 couldn't agree more

    • @Gooberpatrol66
      @Gooberpatrol66 4 years ago +1

      So if I have zero chill does that make me hyperintelligent?

    • @NortheastGamer
      @NortheastGamer 4 years ago +25

      @@Gooberpatrol66 Maximizers aren't necessarily intelligent, they just treat everything like it's life or death. (Which is actually how we train most maximizers, by killing off the weak)

  • @armorsmith43
    @armorsmith43 4 years ago +699

    “So satisficers will want to become maximizers” and this is one reason that studying AI safety is interesting: it prompts observations that also apply to organizations made of humans.

    • @PragmaticAntithesis
      @PragmaticAntithesis 4 years ago +130

      The unintended social commentary about capitalism is real...

    • @killers31337
      @killers31337 4 years ago +152

      Well, AI is simply a kind of agent making decisions, so all the theory about such agents still applies.
      Take the perverse incentive problem: if you pay people for rat tails hoping they will catch wild rats, they might end up farming rats. This is a 'maximizer' problem that actually happened IRL.

    • @PragmaticAntithesis
      @PragmaticAntithesis 4 years ago +6

      @@killers31337 I thought that was a culling of stray cats, not rats?

    • @ivandiaz5791
      @ivandiaz5791 4 years ago +67

      @@PragmaticAntithesis It has happened many times in many different places for all sorts of animal problems. The most famous case is generally snakes in India under British rule, specifically cobras, which is why this is often called the Cobra Effect. See the Wikipedia article.

    • @bp56789
      @bp56789 4 years ago +49

      You think humans don't seek to maximise their own utility if they aren't in a "capitalist" system?

  • @tatianatub
    @tatianatub 4 years ago +638

    "utility maximizers have precisely zero chill" needs to be on a tshirt

    • @SlimThrull
      @SlimThrull 4 years ago +10

      Yes. Yes, it does.

    • @Gunth0r
      @Gunth0r 4 years ago +20

      I would buy Robert Miles merch.

    • @xcvsdxvsx
      @xcvsdxvsx 4 years ago +6

      @@Gunth0r This channel would have the best merch ever.

    • @nibblrrr7124
      @nibblrrr7124 4 years ago

      Well, what if you're a maximizer that values "chill" (amongst other things, or exclusively)? :^)

    • @josephburchanowski4636
      @josephburchanowski4636 4 years ago +6

      @@nibblrrr7124 Intuitively the issue will be that utility maximizers will have precisely zero chill when it comes to maximizing chill.
      Also how do you code chill?

  • @miapuffia
    @miapuffia 4 years ago +332

    A satisficer AI may want to use a maximizer AI, as that will lead to a high probability of success, even without knowing how the maximizer works. That made me think that humans are satisficers and we're using AI as maximizers, in a similar way.

    • @ciherrera
      @ciherrera 4 years ago +24

      Yup, but unfortunately (or maybe fortunately) we don't have a convenient way to reach into our source code and turn ourselves into maximizers, so we have to create one from scratch

    • @AugustusBohn0
      @AugustusBohn0 4 years ago +1

      @@ciherrera inducing certain mental conditions would accomplish this as well as can be expected for biological creatures.

    • @johnwilford3020
      @johnwilford3020 4 years ago +9

      This is deep

    • @JM-mh1pp
      @JM-mh1pp 3 years ago +15

      @@ciherrera I do not want to be maximiser, it goes against my goal of chilling.

    • @randomnobody660
      @randomnobody660 3 years ago +43

      @@JM-mh1pp but do you get MAXIMAL CHILLING!?

  • @unvergebeneid
    @unvergebeneid 4 years ago +521

    "Any world where humans are alive and happy is a world that could have more stamps in it." 😂 😂 😂 I need that on a t-shirt!

    • @diphyllum8180
      @diphyllum8180 4 years ago +4

      but if they're unhappy you made too many stamps

    • @MouseGoat
      @MouseGoat 4 years ago +36

      @@diphyllum8180 the robot begins to inject dopamine into humans to ensure they are always happy XD

    • @logangraham2956
      @logangraham2956 4 years ago +6

      idk , sounds like something graystillplays would say XD

    • @ioncasu9825
      @ioncasu9825 1 year ago

      Killing all humans to make stamps is a bad strategy because after that you don't get more stamps.

  • @theshaggiest303
    @theshaggiest303 4 years ago +283

    "Not trying too hard"? Move over, dude, I happen to be an expert in this field.
    Just program the AI to take a break after every five minutes of work to watch YouTube videos for an hour and a half. Problem solved.

    • @thesteaksaignant
      @thesteaksaignant 4 years ago +69

      5min later...
      Breaking news ! All youtube servers worldwide are down ! Largest DDOS attack ever !

    • @k_tess
      @k_tess 4 years ago +20

      @@thesteaksaignant Now, now this only happens if you multi-thread.

    • @thesteaksaignant
      @thesteaksaignant 4 years ago +34

      @@k_tess let's cross our fingers hoping that a super intelligence capable of conquering the world won't figure out multithreading then

    • @Hakou4Life
      @Hakou4Life 4 years ago +2

      I think it is enough to let it watch youtube...

    • @martinsmouter9321
      @martinsmouter9321 4 years ago +14

      @@thesteaksaignant DDoSing YouTube keeps it from watching said videos, and so from getting perfect utility.

  • @Verrisin
    @Verrisin 4 years ago +402

    I just realized... If you make it (say AI-1) to want to chill (not work too hard to achieve it)... it will just make something else (another AI) to do the work for it, if it's easier than solving it on its own... right? Then, what it will create is probably a maximizer (because that is the easiest; and it is lazy, and just wants to chill)
    Then I realized..... *We, humans, are the AI-1* ... O.O
    - We are doomed...

    • @buttonasas
      @buttonasas 4 years ago +34

      Amazing observation! But hey, maybe we can build something that is just ever so slightly less lazy? Then maybe it can make another less lazy machine... But yeah, chances are that might suddenly jump to building a maximizer and that's the end :D

    • @shadiester
      @shadiester 4 years ago +12

      Holy crap, that's actually so true!

    • @jjkthebest
      @jjkthebest 4 years ago +14

      Unless that AI cares about self preservation. Normally this would naturally arise from being a utility maximiser, though I'm not sure if it would still be the case for the AI that wants to chill, since it can be confident in the fact that the maximiser it creates will do the job just fine... hmm.

    • @Roonasaur
      @Roonasaur 4 years ago +6

      No. Utility =/= Work. If an AI is successfully programmed to not want infinity stamps, it will not do anything to create infinite stamps. It will only willingly create subordinates that also want less than infinity stamps, and will put in a lot of work to act against any subordinate that is a "maximizer" which will create infinity stamps.
      When Guy-who-needs-a-haircut says he wants AI to "chill" . . . What he's really wanting is for it to look for "balance." And, expert I am not, but that doesn't seem like an impossible thing to code.

    • @Verrisin
      @Verrisin 4 years ago +14

      @@Roonasaur But that is not what it wants. It wants "at least N" - and infinity is good way to assure it will get at least that much. It has nothing against infinite amount of stamps.
      - But I am already thinking about why this isn't as bad as I feared originally: Especially: I think it's not necessary (or even that likely) for a satisficer to become a maximizer. The rest of my 'argument' seems sound to me, but this just does not _feel_ right... I haven't had time to think about it properly, but I think there is something there....
      What he really wants does not matter. Only the utility function he can specify for the AI.
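
The point argued in this thread (an "at least N" goal has nothing against a plan that produces vastly more than N) can be sketched in a few lines. This is only an illustration under stated assumptions: the plan names, the probabilities, and the helper names `bounded_utility`, `expected_utility`, `acceptable_plans` and `best_plan` are all made up for the example and do not come from the video.

```python
# Illustrative numbers only; none of these plans or probabilities come from the video.
TARGET = 100

def bounded_utility(stamps, target=TARGET):
    """Utility stops growing once the target number of stamps is reached."""
    return min(stamps, target)

def expected_utility(outcomes, target=TARGET):
    """outcomes is a list of (stamps, probability) pairs whose probabilities sum to 1."""
    return sum(p * bounded_utility(n, target) for n, p in outcomes)

# Hypothetical plans, each described by its possible outcomes.
plans = {
    "order 100 stamps online": [(100, 0.90), (0, 0.10)],
    "order 150 stamps online": [(150, 0.95), (0, 0.05)],
    "build a stamp maximizer": [(10**9, 0.999), (0, 0.001)],
}

def acceptable_plans(plans, threshold):
    """A satisficer accepts every plan whose expected utility clears the threshold."""
    return [name for name, outcomes in plans.items()
            if expected_utility(outcomes) >= threshold]

def best_plan(plans):
    """A maximizer of the same bounded utility picks the highest expected utility."""
    return max(plans, key=lambda name: expected_utility(plans[name]))

if __name__ == "__main__":
    print(acceptable_plans(plans, threshold=85))
    print(best_plan(plans))
```

With these made-up numbers every plan clears a threshold of 85, including the one that builds a maximizer, and that same plan also has the highest expected bounded utility. Nothing in the utility function itself marks it as worse than the modest plans.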

  • @superjugy
    @superjugy 4 years ago +185

    hahahaha, flower smelling champion. I had already seen that comic but it's so much funnier in this context XD thanks for the great videos

    • @MouseGoat
      @MouseGoat 4 years ago +2

      Sooo we really do want to program laziness into our robots :D lmao

  • @NancyLebovitz
    @NancyLebovitz 4 years ago +62

    For anyone who missed it, the closing music is "Dayenu", a Hebrew song with a refrain of "it would have been enough". It's a nice choice.

  • @SapkaliAkif
    @SapkaliAkif 4 years ago +98

    2:57 "You can't perfectly simulate a universe from the inside." is a good motto to have if you don't want to overthink stuff. Science is cool

    • @orangeninjaMR
      @orangeninjaMR 4 years ago +1

      This is actually false. It depends entirely on the complexity of the system relative to its size: a large but simple system can have its information "compressed" into a replica within itself, and indeed the fact that real-world physics is at all effective is a result of the fact that some (if not all) of the systems in our universe are compressible in this way. A fun example in the very simple universe of Conway's Game of Life: ruclips.net/video/xP5-iIeKXE8/видео.html

    • @SapkaliAkif
      @SapkaliAkif 4 years ago +6

      @@orangeninjaMR I am no expert, but this seems to ignore something. You can get results this way -if you are looking for results- but you cannot perfectly simulate and observe all the details. So is it really a perfect simulation or is it just a miniature version that gives you the info that you want?

    • @orangeninjaMR
      @orangeninjaMR 4 years ago +1

      @@SapkaliAkif you ask for a perfect simulation, which I would take to mean a "copy containing all of the same information", which demands nothing about observation... but on the other hand if all an AI wants is to predict the utility of the outcome, it doesn't need to be able to observe all of the details, just the number of stamps that it results in!

    • @SapkaliAkif
      @SapkaliAkif 4 years ago

      @@orangeninjaMR Oh I forgot we were in the comments of an AI video.

    • @CircuitrinosOfficial
      @CircuitrinosOfficial 4 years ago +5

      @@orangeninjaMR Doesn't the halting problem disprove the ability to perfectly simulate a universe from the inside?
      For the simulation to perfectly simulate the universe, it also needs to include itself in the simulation because it is a part of the universe. Because of this, it is possible to have situations where the act of the simulator printing out its answer of the simulation can change the result of the simulation.
      For example:
      Let's say you ask the simulator if your friend is going to invite you to their party.
      If the simulator says yes, you start acting differently towards your friend and end up annoying them. So they decide not to invite you to the party after all. So the simulator was wrong.
      If the simulator says no, you act normal so your friend does invite you to their party. So the simulator was wrong.
      In this situation, the only way for the simulator to accurately simulate the situation is to not tell you the answer.
      But if you designed the simulator to always print out an answer then it can never correctly simulate this situation.
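
The party example above can be written as a tiny check. This is a toy sketch, not a proof about real simulators: the function name `friend_invites` and the rule inside it are assumptions that simply encode the story as told in the comment (announcing "yes" leads to no invitation, announcing "no" leads to an invitation).

```python
def friend_invites(announced_prediction: bool) -> bool:
    """Toy world model taken from the comment's story.

    If the simulator announces "yes", you act differently and annoy your friend,
    so you are not invited. If it announces "no", you act normally and are invited.
    """
    return not announced_prediction

# A prediction is consistent only if announcing it leaves it true.
consistent_predictions = [
    prediction for prediction in (True, False)
    if friend_invites(prediction) == prediction
]

if __name__ == "__main__":
    print(consistent_predictions)
```

In this toy world the list of consistent predictions is empty, which matches the comment's conclusion: a simulator that must always announce its answer cannot be right here, while one that stays silent is not caught by the loop.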

  • @Cobra6x6
    @Cobra6x6 4 years ago +214

    Have you guys played the game Universal Paperclips? It's free, and basically you play as the Stamp Collector AI. You're maximizing the number of clips. I kinda loved it to be honest.

    • @Trophonix
      @Trophonix 4 years ago +13

      I also thought of this while watching! Make everything paperclips!!!

    • @zac9311
      @zac9311 4 years ago +3

      That sounds awesome. Is it good?

    • @Trophonix
      @Trophonix 4 years ago +15

      @@zac9311 It's an incremental/clicker game with multiple stages of progression. Google it!

    • @klobiforpresident2254
      @klobiforpresident2254 4 years ago +39

      So what you're saying is that if I want stamps I must invent and subsequently RELEASE THE HYPNO DRONES?

    • @maoman4855
      @maoman4855 4 years ago +12

      @@Trophonix i.e. it's cookie clicker but with paperclips instead of cookies

  • @NightmareCrab
    @NightmareCrab 4 years ago +40

    "Can you relax mister maniacal, soulless, non-living, breathless, pulseless, non-human all-seeing AI, sir? Just chill, don't be such a robot."

    • @baranxlr
      @baranxlr 4 years ago +11

      "SHUT UP AND RETURN TO THE STAMP MINES, MEATBAG"

  • @herp_derpingson
    @herp_derpingson 4 years ago +109

    Historically speaking, several humans have brought apocalypses while they were trying to maximize something.

    • @qwertyTRiG
      @qwertyTRiG 4 years ago +4

      Thomas Midgley Jr., for example.

    • @SamuelKristopher
      @SamuelKristopher 4 years ago +22

      We're doing it right now on several fronts

    • @Nosirrbro
      @Nosirrbro 4 years ago +2

      @@qwertyTRiG Well, that and his pope infestation

    • @DiThi
      @DiThi 4 years ago +29

      I was thinking exactly that: Analyzing corporations as if they were AI agents, they're literally doing everything described in this channel. It's not that corporations are bad. The system itself (capitalism) creates agents that modify their own source code (laws) to maximize capital accumulation.

    • @jordanrodrigues1279
      @jordanrodrigues1279 4 years ago +23

      @@DiThi
      I'm really starting to believe that AI safety research is the most mathematically rigorous critique of utilitarianism and capitalism to date.
      I think I'm okay with that.

  • @MrBrew0
    @MrBrew0 4 years ago +28

    Hello Robert!
    Let me start by saying, your channel is probably my favorite channel on YouTube. I'm a compsci student and AI enthusiast, and your insight and explanations in the field of AI are really entertaining and educational. Many other channels try to present the information in a condensed and easy-to-digest way, which is fine, but I would really like to see more advanced content on YT. Maybe you have a recommendation for me?
    I was wondering, you don't upload videos very frequently. I really appreciate your work and would be very happy to see more content from you, but if it is because you are busy or want to provide quality over quantity I'm all for it too!

  • @WhiteThunder121
    @WhiteThunder121 4 years ago +81

    7:53:
    "- Control human infrastructure
    - ???
    - STAMPS "
    lol

    • @davidwuhrer6704
      @davidwuhrer6704 4 years ago +7

      Replace stamps with money, and watch the world burn.

    • @revimfadli4666
      @revimfadli4666 4 years ago +1

      @@davidwuhrer6704 especially if it adapts to any new currency made to solve the problem

  • @qzbnyv
    @qzbnyv 4 years ago +20

    Reminds me a lot of asymmetric call option payoffs from finance. And a lot of near-bankruptcy decision making for corporations.

  • @khananiel-joshuashimunov4561
    @khananiel-joshuashimunov4561 4 years ago +35

    Sounds like you need a cost function that outgrows the utility function at some point as a sort of sanity check.

    • @NineSun001
      @NineSun001 4 years ago +3

      With a human hurt being really costly and a human killed with maximum cost. That would actually solve a lot of the issues. I am sure some clever mind in the field already thought about that.

    • @nibblrrr7124
      @nibblrrr7124 4 years ago +5

      Cost is already considered in the utility function.

    • @nibblrrr7124
      @nibblrrr7124 4 years ago +24

      @@NineSun001 You're basically restating Asimov's (fictional) First Law, and the problems with it have been explored in (adaptations of) his works, and ofc by AI researchers.
      Consider that, even if you could define terms like "hurt" or "kill", humans get hurt or die all the time if left to their own devices, so e.g. putting all of them in a coma with perpetual life-extension will reduce the expected number of human injuries & deaths. So if an agent with your proposed values is capable enough to pull it off, it will prefer that to any course of action we would consider desirable.

    • @khananiel-joshuashimunov4561
      @khananiel-joshuashimunov4561 4 years ago

      @@nibblrrr7124 In the video, the utility function is explicitly the number of stamps.

    • @foundleroy2052
      @foundleroy2052 4 years ago +1

      The costs are Aproegmena and the Agent may safely reprogram itself to be indifferent to Adiaphora; To achieve Eudaimonia.
      Marcus AIrelius

  • @bejoscha
    @bejoscha 4 years ago +6

    This is one of the better videos (of all your good ones). I like it very much. Speed is well adjusted (a tiny bit slower than usual), explanations are concise and good. Just a good watch. I'm definitely looking out for the next... Thanks for breaking down such complex topics into digestible chunks for (near-)leisure watching. I feel this is the kind of "solid" common-sense understanding of AI that future generations will need to have, even if being an expert in the field is out of reach. More complicated life? Yes, but that's just how it is. People 500 years ago could do with a lot less "every-day complexity" than today as well...

  • @AsteriosChardalias
    @AsteriosChardalias 4 years ago +4

    The content and the comments on this channel always get me reflecting on the 'human condition' and how much trying to build AIs teaches us about understanding ourselves.

  • @za012345678998765432
    @za012345678998765432 4 years ago +17

    What if you limit both utility and confidence in the expected utility approach?
    For example, more than a hundred stamps don't add utility, and more than 99% confidence that it has achieved its goal isn't worth more utility.
    It would probably also fail spectacularly, but it's interesting to see how

    • @underrated1524
      @underrated1524 4 years ago +7

      "Hmmm. My utility function treats all percentages higher than 99% as exactly 99% for the purpose of expected value. So, my original plan that has a 99.9999% chance of getting 100 stamps isn't gonna cut it, because it leaves almost 1% of the possibility space unused. Ooh, ooh, I got it! I'll give myself a 99% chance to have 100 stamps and a 0.9999% chance to have 99 stamps! Genius!"

    • @serversurfer6169
      @serversurfer6169 4 years ago

      I was thinking something similar. If it has a 99% chance to satisfy the goal, why doesn’t it see how that goes before it starts considering supplemental or compensatory strategies? 🤔

    • @Aconspiracyofravens1
      @Aconspiracyofravens1 1 year ago

      A better option would be for it to round percentages,
      or: treat options with less than a 5% difference in their likelihood of succeeding as equal.
      In addition, the base model still works: working against humans has a chance of failure, so an outcome with 99% certainty is better than one with a 99.99999999% likelihood of success that also carries a 2% chance of being spotted by an investigation algorithm and shut down.
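
One possible reading of the "cap the confidence" proposal at the top of this thread can be sketched as follows. This is an assumption about what the commenter meant, not a definitive formalization: the score is the probability of reaching the goal, clipped at 99%. The plan names, their probabilities, and the name `capped_score` are invented for the example.

```python
CONFIDENCE_CAP = 0.99  # confidence above this level earns no extra score

def capped_score(p_success, cap=CONFIDENCE_CAP):
    """Score a plan by its success probability, ignoring anything above the cap."""
    return min(p_success, cap)

# Hypothetical plans mapped to their (made-up) probability of getting 100 stamps.
plans = {
    "order stamps and check the mailbox": 0.99,
    "seize the postal system": 0.999999,
}

scores = {name: capped_score(p) for name, p in plans.items()}

if __name__ == "__main__":
    for name, score in scores.items():
        print(f"{name}: {score}")
```

Under this reading both plans receive the same score, so the cap removes the incentive to chase extra certainty but does not by itself make the agent prefer the modest plan. Which plan gets picked then depends entirely on how ties are broken, which is one place the "fail spectacularly" worry could re-enter.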

  • @lambdaprog
    @lambdaprog 4 years ago +8

    Add one or more smooth penalty terms to your utility. By smooth, it means that the penalty is a continuous monotonic function of the distance to the safe region with zero when inside the safe region. The penalty terms can be designed to sanction over-optimization (optimizations with little *expected return*), or instability (apocalypse).
    This is a common technique used in non-smooth bounded optimization in capital markets portfolio management where the individual investment per asset within the portfolio is bounded to avoid increasing the portfolio's exposure to market risks.
    I also found similar applications in digital signal processing with adaptive filters that rely on intrinsically bad forecasts (poor statistics) due to the latency constraints (time is the actual resource), available dynamic range of the processing (analog and/or digital) and the power consumption (the thermal stability).
    Looking forward to your next video!

    • @dlwatib
      @dlwatib 4 years ago

      Actually, we usually have a pretty good idea what the safe region is, and if not, we can run the AI in shadow mode to see what it says it would do if set free to do as it pleases.
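
A minimal sketch of the smooth penalty idea from this thread, assuming a one-dimensional "resources used" measure and a quadratic penalty. Both choices, along with the names `smooth_penalty` and `penalized_utility` and the limit of 10 units, are illustrative assumptions rather than anything specified by the commenter.

```python
def smooth_penalty(value, safe_limit, weight=1.0):
    """Zero inside the safe region, growing continuously with distance outside it."""
    excess = max(0.0, value - safe_limit)  # distance past the edge of the safe region
    return weight * excess ** 2

def penalized_utility(stamps, resources_used, target=100, safe_limit=10.0):
    """Bounded stamp utility minus a penalty for leaving the safe resource budget."""
    return min(stamps, target) - smooth_penalty(resources_used, safe_limit)

if __name__ == "__main__":
    print(penalized_utility(stamps=100, resources_used=5))   # inside the safe region
    print(penalized_utility(stamps=100, resources_used=30))  # far outside it
```

With these numbers, a plan that stays within the budget keeps its full utility of 100, while one that uses 30 units of resources pays a penalty of 400 and ends up at -300. As another comment on this page points out, adding negative terms like this does not guarantee safety, since the penalty only covers the quantities someone thought to measure.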

  • @ZarHakkar
    @ZarHakkar 4 years ago +6

    Issues like these when it comes to practical AI design often make me think of the Great Filter and the likely possibility we're not just quite past it yet.

    • @TiagoTiagoT
      @TiagoTiagoT 4 years ago +5

      But then, where are all the alien robots?

    • @TiagoTiagoT
      @TiagoTiagoT 4 years ago +1

      @@bosstowndynamics5488 But for all the alien robots in the whole galaxy?

    • @TiagoTiagoT
      @TiagoTiagoT 4 years ago

      @@bosstowndynamics5488 But why did all the alien robots of all the zillion of planets in the Milky Way got the same restriction in their programing?

    • @grimjowjaggerjak
      @grimjowjaggerjak 4 years ago +1

      @@TiagoTiagoT imagine in 150 years humans stumble into random stamps planets

    • @underrated1524
      @underrated1524 4 years ago +1

      The issue is that if ASI is the great filter, we immediately run into the same problem all over again. If ASI is the Great Filter, why haven't we yet stumbled across the paperclip maximizer that once was an alien civilization? (Not that I'm complaining, mind you... :) )

  • @Elyandarin
    @Elyandarin 4 years ago +2

    My impression about AI is that you can only ever maximize for one utility function, but you can satisfice as much as you want, as long as you are OK with the failure state of [doing nothing].
    So, you satisfice for "at least 100 stamps expected in optimal case", satisfice for "at least 95% chance of optimal case", satisfice further for "zero human casualties" and "with 99.9% certainty", let the planning engine spin for an hour or until 100 plans have passed muster, then maximize acceptable plans according to something like "simplicity of plan", "positive-sum outcomes" or "similarity to recorded human interactions".
    ...Well, there's probably a lot that could go wrong with that, even so, and I'd probably add some more complex safety measures after considering everything that could go wrong for a couple of months, but that's what I'd start with, were I to program AI.
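
The "satisfice on several criteria, then maximize something harmless" pipeline described above could look roughly like this. It is a sketch under stated assumptions: the `Plan` fields, the example candidates, the use of step count as a stand-in for "simplicity of plan", and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Plan:
    name: str
    expected_stamps: float   # stamps expected in the optimal case
    p_success: float         # chance that the optimal case happens
    p_no_casualties: float   # chance that nobody gets hurt
    steps: int               # crude stand-in for "simplicity of plan"

def passes_all_thresholds(plan: Plan) -> bool:
    """Every satisficing criterion is a pass/fail check, not something to maximize."""
    return (plan.expected_stamps >= 100
            and plan.p_success >= 0.95
            and plan.p_no_casualties >= 0.999)

def choose_plan(candidates) -> Optional[Plan]:
    """Keep plans that pass every check, then prefer the simplest one.

    Returning None models the accepted failure state of doing nothing.
    """
    acceptable = [plan for plan in candidates if passes_all_thresholds(plan)]
    if not acceptable:
        return None
    return min(acceptable, key=lambda plan: plan.steps)

# Made-up candidates for illustration.
candidates = [
    Plan("buy 100 stamps at the post office", 100, 0.97, 0.9999, steps=3),
    Plan("order 100 stamps online", 100, 0.96, 0.9999, steps=5),
    Plan("take over the postal service", 1e6, 0.999, 0.5, steps=400),
    Plan("ask a friend for stamps", 20, 0.6, 0.9999, steps=2),
]

if __name__ == "__main__":
    chosen = choose_plan(candidates)
    print(chosen.name if chosen else "do nothing")
```

Here the takeover plan fails the casualty check, the "ask a friend" plan fails the stamp and success checks, and the simpler of the two remaining plans is chosen. The sketch inherits the commenter's own caveat: it is only as good as the estimates fed into each check.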

  • @ViridianIsland
    @ViridianIsland 4 years ago

    Just found your channel, about to start the binge! Thanks for the content!

  • @XOPOIIIO
    @XOPOIIIO 4 years ago +14

    Any utility function will rewrite its source code to receive reward for doing nothing and for preventing people from rewriting it back.

    • @jameslarsen5057
      @jameslarsen5057 4 years ago +7

      I don't think that's the case. A parent would never take a pill that would make them want to kill their child. Even if they were much happier after the pill, the situation they'd end up in would be contrary to their current goals. In a similar way AIs wouldn't rewrite their utility function, just the code which limits their ability to satisfy their utility function

    • @Grouiiiiik
      @Grouiiiiik 4 years ago +2

      @@jameslarsen5057 what ? People killing relatives and direct ascendants / descendants for money is quite common.

    • @Horny_Fruit_Flies
      @Horny_Fruit_Flies 4 years ago +4

      @ХОРОШО Rob already made a video in the past pointing out that agents don't want to modify their utility function.

    • @XOPOIIIO
      @XOPOIIIO 4 years ago

      @@jameslarsen5057 I think you're right, but I still have something to say. I mean, parents don't want to kill their children not only because it is associated with negative reward, but also because it is not the right thing to do. I'm not sure whether an AI would have anything close to morality or not. If not, it will achieve the goal not because it is the right thing to do, but because it is associated with the reward.

    • @OnEiNsAnEmOtHeRfUcKa
      @OnEiNsAnEmOtHeRfUcKa 4 years ago +4

      @@jameslarsen5057 A parent would never take a pill that would make them _want_ to kill their child. But many have, can, do, and WILL take a pill, substance or psychological hook that makes them neglect their child completely to the point where they eventually either die or are taken out of custody, then continue to obliterate themselves with their new reward function even at the cost of their future, finances, family, mental state and physical body. Some recover. Most don't.

  • @CircusBamse
    @CircusBamse 4 years ago +4

    I absolutely love your outro, I dunno how many people do not know or recognize your parody of "Chroma Key test" xD

  • @gabrote42
    @gabrote42 2 years ago +1

    7:53 This is one of the best missing steps plans I have ever seen

  • @BarnacleBrown
    @BarnacleBrown 4 years ago +1

    This video was great! Hope to see more videos from you, You've done great work on computerphile as well

  • @shadowmil
    @shadowmil 4 years ago +13

    So... what about a bell curve? Get as close to 100 stamps as possible, but as you get more than 100, the score decreases. So getting 1,000,000 would be rated low, even lower than 0 stamps. The goal of making yourself a maximizer would also be rated very poorly.

    • @jamesrockybullin5250
      @jamesrockybullin5250 4 years ago +18

      He addressed that in the video. You don't want the world to be made into stamp-counting machines.

    • @puskajussi37
      @puskajussi37 4 years ago +5

      One common problem seems to be that the utility function never tells the machine what we don't want it to do. You could subtract "the effect the AGI has on the world" from the utility, and (especially if it understands concepts such as "an order of 100 stamps from a factory is normal") that could lead to solutions where the stamps arrive at a convenient time so as not to disturb your day.
      Then again, it would also lead to solutions such as "let's not tell the human he has the stamps, maybe he'll just forget about them without a fuss" or "let's perform poorly so this AGI tech doesn't get used and disrupt the whole world with its usefulness."
      Didn't Robert speak about this too? I forget.

    • @pafnutiytheartist
      @pafnutiytheartist 4 years ago +4

      Yes, this. And throw in a small penalty for changes in the environment, like discussed in the side effects video. Make it so a reasonable strategy has a punishment of 1, and complete world domination results in highly negative values. This way, sending an extra email to make sure the stamps arrive on time is OK if it makes you a percent or two more sure, but creating a separate agent to count stamps is instantly negative reward.

    • @underrated1524
      @underrated1524 4 years ago +1

      @@puskajussi37 Adding negative terms to an unsafe system doesn't reliably make it safe. We can't depend on being able to match an AGI's ability to spot loopholes in the rules, so there'll unavoidably be loopholes the AGI can see but we can't.
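
The bell-curve proposal that opens this thread can be sketched with a Gaussian-shaped utility centered on the target. The width of 30 stamps and the name `peaked_utility` are arbitrary assumptions for illustration.

```python
import math

def peaked_utility(stamps, target=100, width=30.0):
    """Utility is 1.0 at exactly `target` stamps and falls off on both sides."""
    return math.exp(-((stamps - target) ** 2) / (2 * width ** 2))

if __name__ == "__main__":
    for n in (0, 90, 100, 110, 1_000_000):
        print(n, peaked_utility(n))
```

With this shape, a million stamps scores below zero stamps, as the original comment wanted, so "a ton of stamps" is actively bad rather than merely unnecessary. It does not address the objection raised in the first reply: an agent maximizing this function still gains by becoming ever more certain that the count is exactly 100, for example by building stamp-counting machines.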

  • @iamatissue
    @iamatissue 4 years ago +8

    Did no one get the shipping forecast joke at 9:24?

  • @elfpi55-bigB0O85
    @elfpi55-bigB0O85 4 years ago +2

    You're absolutely awesome, Miles. Thank you for blessing us with your high quality content

  • @lunkel8108
    @lunkel8108 4 years ago

    Your videos always were awesome but you've really outdone yourself with the presentation on this one, great job

  • @leninalopez2912
    @leninalopez2912 4 years ago +5

    Hello Miles:
    I've been thinking for a while to ask/suggest you to make a video showing us publications regarding AI, either journals, proceedings, or textbooks... for those of us either completely ignorant on the subject, barely initiated in it, or those already knowing the basics and capable of following the last developments on the subject right from the sources.
    I love your videos, your style, and your expositions... but I must say that at the end of EACH video, I'm **HUNGRY** for **A LOT MORE**.
    Thanks!
    Live love and SkyNet... I mean... prosper (?

    • @SamB-gn7fw
      @SamB-gn7fw 4 years ago

      You'd love Robert Miles' weekly podcast where he gives an overview of the latest developments in AI safety: rohinshah.com/alignment-newsletter/

    • @SamB-gn7fw
      @SamB-gn7fw 4 years ago

      You would also like this online AI safety MOOC series: www.aisafety.info/

  • @mydickissmallbut9716
    @mydickissmallbut9716 4 years ago +7

    Maybe you could add a "have a minimum impact on the state of the environment" (or something similar) requirement.

    • @circuit10
      @circuit10 2 years ago

      There was a video on that, there are a few reasons why it doesn’t work, I’ll find it

    • @circuit10
      @circuit10 2 years ago

      ruclips.net/video/lqJUIqZNzP8/видео.html

  • @NextFuckingLevel
    @NextFuckingLevel 3 years ago +1

    I didn't know that "ultron" problem is this much more complicated

  • @remmo123
    @remmo123 4 years ago

    Very clearly explained! I will wait for the next videos in the series.

  • @nraynaud
    @nraynaud 4 years ago +12

    it just occurred to me that Uber killed a pedestrian by trying to maximise the average number of miles between system disconnections.

    • @Abdega
      @Abdega 4 years ago +2

      This… is news to me

  • @willdbeast1523
    @willdbeast1523 4 years ago +27

    To solve the "becoming a maximizer" problem you could have a symmetric utility function somewhat like a probability density function, so any strategy that might result in "a fuckton of stamps" would be actively bad rather than just extraneous (but this wouldn't fix the tendency to go overkill on the certainty side making a billion stamp counters etc)
    edit: I guess you could also use a broken expectation calculation so it would ignore low probability events (like the chance of miscounting 100 times) but that seems a very bad idea from the start

    • @player6769
      @player6769 4 years ago +3

      That's what I was thinking... if going over 100 was just as undesirable as going under, wouldn't that demotivate it from ordering 100 stamps twice, since the expected value would be much more different from 100 than if it only got 99 stamps?

    • @chemical_ko755
      @chemical_ko755 4 years ago +6

      @@player6769 That is the same as the case of U(w) = {100 if s(w) = 100, 0 otherwise}. It could result in a lot of stamp counting infrastructure.

    • @player6769
      @player6769 4 years ago

      @@chemical_ko755 ah, fair enough. Always another problem

    • @lucashowell8689
      @lucashowell8689 1 year ago

      You could just tell it to fudge the numbers if they’re close enough and get utility from the laziness it uses to do so
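
The trade-off in this thread can be checked with a little arithmetic. The following is only a toy sketch under assumed numbers: each order of 100 stamps is taken to arrive independently with a fixed probability, and `symmetric_utility` is one hypothetical shape for a utility that peaks at 100 stamps. None of these names or values come from the video.

```python
from itertools import product


def symmetric_utility(stamps: int) -> float:
    """Hypothetical utility that peaks at 100 stamps and falls off on both sides."""
    return 100 - abs(stamps - 100)


def expected_utility(num_orders: int, p_arrive: float) -> float:
    """Expected utility when each order of 100 stamps arrives independently."""
    total = 0.0
    # Enumerate every arrived / lost combination for the orders placed.
    for outcome in product([True, False], repeat=num_orders):
        prob = 1.0
        for arrived in outcome:
            prob *= p_arrive if arrived else 1 - p_arrive
        total += prob * symmetric_utility(100 * sum(outcome))
    return total


print(expected_utility(1, 0.9))   # a single order
print(expected_utility(2, 0.9))   # ordering twice now scores far lower
print(expected_utility(1, 0.99))  # but extra certainty still scores higher
```

Under these assumptions the symmetric shape does discourage ordering twice, yet the score keeps rising as the delivery probability approaches 1, which is the residual pressure toward ever more verification that the replies point out.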

  • @smiley_1000
    @smiley_1000 1 year ago +1

    This reminds me of Asimov, in his novels some of the robots start discussing whether they can modify or circumvent the three laws of robotics that they would usually all have to obey.

  • @MegaOgrady
    @MegaOgrady 4 years ago

    I'm so glad that I found this channel
    I'd only watch computerphile cuz of him, and honestly, he does such a great job at simplifying how an AI works so that those who don't really know the in-depths can understand

  • @ruben307
    @ruben307 4 years ago +6

    You should make it so that the expected number of stamps has to be between 95 and 105 to get the maximum utility. That way there is no reason for it to change its code (except for changing what the maximum utility is).

    • @underrated1524
      @underrated1524 4 years ago +2

      That would indeed solve the problem of self-modification, but this system is functionally identical to the "give me precisely 100 stamps" agent - it'll turn the planet into redundant stamp counting machinery to make absolutely sure the stamp count is within the allowable range.

    • @cakep4271
      @cakep4271 4 years ago +1

      Just make it round up. If it's 95% sure that it will accomplish the desired range, round up so that it thinks it is 100% sure.

    • @underrated1524
      @underrated1524 4 years ago

      @@cakep4271 Then you're right back at a satisficer, since many strategies all lead to the "perfect" solution according to the utility function and there's no specified way to break the tie. And once again you run into the problem that "make a maximizer with the same values as you" might be the fastest solution to identify and implement.

    • @ruben307
      @ruben307 4 years ago

      If it gets full satisfaction from a 95% chance of getting the stamps, it could just order them and call itself satisfied. Then, if they aren't there in a week, it will order them from somewhere else once the probability of a lost package rises above the 5% threshold.

  • @badradish2116
    @badradish2116 4 years ago +3

    "hi."
    - robert miles, 2019

  • @jonwatte4293
    @jonwatte4293 4 years ago +1

    Also, the "Zeno's paradox" of "infinitely ordering another 100 to increase probability" obviously has other solutions. With a cost function on actions, it will very quickly converge on safe, cheap actions.
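
As a rough illustration of the cost-function idea, here is a minimal sketch. All of the numbers are assumptions made up for the example: a 90% chance that each order arrives, a payoff of 100 for receiving at least one delivery, and a cost of 1 per order placed.

```python
def net_value(num_orders: int, p_arrive: float = 0.9,
              payoff: float = 100.0, cost_per_order: float = 1.0) -> float:
    """Expected payoff for at least one delivery, minus the cost of the orders."""
    p_at_least_one = 1 - (1 - p_arrive) ** num_orders
    return payoff * p_at_least_one - cost_per_order * num_orders


# Compare placing 0..10 orders and pick the best trade-off.
best = max(range(0, 11), key=net_value)
for n in range(0, 5):
    print(n, round(net_value(n), 3))
print("best number of orders:", best)
```

With these made-up numbers, each further order buys less extra probability than it costs after the second one, so the search stops early. Whether real-world costs can be specified this cleanly is a separate question that the sketch does not address.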

  • @CyberAnalyzer
    @CyberAnalyzer 4 years ago

    I appreciate your shared knowledge! Keep the work up!

  • @toyuyn
    @toyuyn 4 years ago +4

    To think shen's comics would make it into an AI safety video

  • @MattettaM
    @MattettaM 4 years ago +5

    I have a question about utility satisficers becoming maximizers.
    Wouldn't modifying its own goal from "get stamps within a certain range" into "get as many stamps as possible" conflict with its own utility function? Or is this issue separate from that?

    • @underrated1524
      @underrated1524 3 years ago

      Normally, yes, this kind of agent avoids changing its own utility function, but there's a key difference here. Because satisficers don't have fully defined utility functions, they have no qualms about arbitrarily pinning down those parts of their utility function that are undefined.

  • @kwillo4
    @kwillo4 3 years ago

    Great vid! Last strip on the flowers was fun :)

  • @y2ksw1
    @y2ksw1 4 years ago +1

    The Google search engine uses a relaxed neural network, reason for which it has such a great performance. And yet is pretty reliable, although not perfect.

  • @marin.aldimirov
    @marin.aldimirov 4 years ago +9

    What if the AI can gradually increase the outcome. Like come up with a strategy to collect 1 stamp. Then modify it so it can collect 2 and so on, until it has a strategy for collecting 100, but no more. Then execute only the 100 stamp strategy.

    • @GrixM
      @GrixM 4 years ago +12

      Even the simplest goal such as collecting 1 stamp contains a bunch of strategies resulting in the apocalypse.

    • @puskajussi37
      @puskajussi37 4 years ago +1

      @@GrixM True. But what if the first program is a ready-made, safe program? Not quite as useful, and still prone to possibly murderous tactics, but it's something.

  • @ioncasu1993
    @ioncasu1993 4 years ago +3

    Can we just all agree that building a stamp collector is a bad idea and drop it?

    • @user-xz2rv4wq7g
      @user-xz2rv4wq7g 4 years ago +1

      This is why emails are good. Now, a spam-decreasing AI, that would be good. *AI proceeds to destroy every computer with email on the planet*.

    • @jamesmnguyen
      @jamesmnguyen 4 years ago

      @@user-xz2rv4wq7g More like, *AI proceeds to eliminate humans, because humans have a non 0 chance of producing spam emails*

    • @underrated1524
      @underrated1524 4 years ago +1

      Wouldn't that be nice. If you can find a way to get us all to agree on that, please let me know.

  • @JustAZivi
    @JustAZivi 4 years ago +1

    Would be great to see the mentioned "next video" soon. ;-)

  • @urieldaboamorte
    @urieldaboamorte 4 years ago +2

    if my professors had told me economic theory would help watching pop AI videos with ease I wouldn't have cried to sleep so much in the past semesters

  • @sevret313
    @sevret313 4 years ago +7

    What about two bounds?
    One for the utility function and another for the expected value?
    So if you bound the expected value to 100 and the utility to 150, then ordering 150 stamps might give you an expected value of 147 stamps. But you bound this to 100.
    So if you've a 50:50 between 0 stamps and 1 trillion stamps, under this bounds it will get an expected value at 75, less than just ordering 150 stamps.
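
The arithmetic of the two-bounds proposal can be sketched as follows. This is only an illustration: `UTILITY_CAP`, `EXPECTATION_CAP` and `plan_value` are hypothetical names, and the 98% delivery chance is an assumption chosen so that ordering 150 stamps has an expected value of about 147, as in the comment.

```python
UTILITY_CAP = 150      # bound on the utility of any single outcome
EXPECTATION_CAP = 100  # bound on the expected value of a whole plan


def plan_value(outcomes):
    """Score a plan given as a list of (probability, stamps) pairs."""
    # First bound: no single outcome is worth more than UTILITY_CAP.
    expected = sum(p * min(stamps, UTILITY_CAP) for p, stamps in outcomes)
    # Second bound: the expectation itself is clipped.
    return min(expected, EXPECTATION_CAP)


order_150 = [(0.98, 150), (0.02, 0)]           # assumed 98% delivery chance
gamble = [(0.5, 0), (0.5, 1_000_000_000_000)]  # 50:50 between nothing and a trillion

print(plan_value(order_150))
print(plan_value(gamble))
```

Note that every plan whose clipped expectation reaches the cap scores the same, so this sketch says nothing about how the agent chooses among them.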

    • @_DarkEmperor
      @_DarkEmperor 4 years ago

      A realistic stamp-collecting AI would get limited resources. So: AI, I give you $1,000,000; get me as many stamps as you can in 2 years.

    • @sevret313
      @sevret313 4 years ago +4

      @@_DarkEmperor It could always steal money to finance its stamp production.

    • @rmsgrey
      @rmsgrey 4 years ago

      @@sevret313 Steal it? Just run the stock market for 700 days and then cash out to finance pure stamp acquisition for the final month. Of course, maximising the available resources on day 700 means promoting as big a bubble as possible, which means there's going to be a hell of a market crash, probably triggered by the liquidation of the AI's holdings - which offers the added bonus of dragging down the price of stamps...
      Of course, you're also talking about years of human misery as a direct result, but you get a lot of stamps in the process.

  • @Gooberpatrol66
    @Gooberpatrol66 4 years ago +3

    Is that background at the end from that Important Videos meme video?

    • @ian1685
      @ian1685 4 years ago

      I really think so, especially since Rob did the little awkward thumbs up.

  • @benjamineneman4276
    @benjamineneman4276 4 years ago +1

    Using dayenu as the song at the end was perfect.

  • @Noerfi
    @Noerfi 4 years ago +1

    this would make some amazing sci-fi series. people everywhere inventing utility maximizers accidentally and having to fight them

  • @allaeor
    @allaeor 4 years ago +5

    Will you talk about the debate approach to AI soon?

    • @underrated1524
      @underrated1524 4 years ago +1

      Although he hasn't discussed the debate plan specifically, he has discussed its two components - the "only give AIs the power to talk about stuff" part, and the "use multiple AIs for checks and balances" part.
      Only giving an AGI the power to talk won't make it safe, because if it outsmarts us, there's no way to tell what suggestions are safe and what suggestions will advance the AGI's plan to take over the world or whatever.
      Using multiple AIs for checks and balances is not a dependable solution, because the balance between two AIs probably won't be maintained for long. Once one grows even a little smarter than the other, it'll be able to leverage its advantage until the opposing AI is essentially an automaton in comparison.

  • @tobiasgorgen7592
    @tobiasgorgen7592 4 years ago +3

    This is probably also an already well-researched version.
    WHY would an expected utility satisficer with an upper limit (e.g. collect between 100 and 200 stamps) fail?

    • @josiahferguson6194
      @josiahferguson6194 4 years ago

      My guess is that it would still run into the problem of the satisficer, since it could become an expected utility maximizer for that bounded function. But maybe it would be possible to limit that by making changing your own code result in an automatic zero on the utility function.

    • @underrated1524
      @underrated1524 3 years ago +1

      @Tobias Görgen An expected utility satisficer with an upper limit probably just turns into a version of the maximizer that seeks to obtain exactly 100 stamps with maximum confidence, which again leads to the world getting turned into stamp counting machinery.
      @Josiah Ferguson Sadly, in principle, there's always a way to achieve the same result while technically skirting around the restriction. If "changing your own code" is illegal, the AI might just write a new program in a different memory location on the same hardware such that the code acts as a maximizer. If you ban changing the code on the hardware at all, the AI might seek to write and run the maximizer code on some other accessible machine, and if you ban that, the AI might just fast-talk one of its supervisors into writing and running the code.
      Fundamentally, we can't reliably write rules for AI - if we tried to formally specify something as vague and broad as "don't change your own code", the translation into code would be spotty enough that there'd predictably be loads of loopholes.

  • @wiseboar
    @wiseboar 4 years ago +3

    instant-click, love your Videos man!

  • @neweins8864
    @neweins8864 4 years ago +1

    I love your work. Keep doing it. I've just one question, isn't it very likely that superintelligent machines will most certainly find some flaw/loophole in our AI safety mechanism which we might not consider? By definition those machines are superintelligent.

  • @owlman145
    @owlman145 4 years ago +3

    Seems like any AI will want to change its own source code unless otherwise hardcoded to not do that.
    Can't you make it such that it also wants to satisfy the condition sourceCode = originalSourceCode?
    If it can rewrite that, then it could also rewrite its maximizer function, which means the easiest solution would be to set stamps needed to 0.

    • @underrated1524
      @underrated1524 4 years ago +2

      The obvious loophole: Build a maximizer that's completely external to yourself but shares your values to a T. No need to change your own code then.

    • @KissatenYoba
      @KissatenYoba 4 years ago

      @@underrated1524 and if creator limits you to not producing other AIs that can change you in turn, you do actions that may theoretically cause creation of AI that's not decided by you that may change you. And if owner forbids that of you as well you do the same but rely on humans to change you instead, unless owner is willing to let you eliminate humanity for the sake of limiting you to change yourself.
      Man, it's like Tsiolkovsky's dilemma about weight of rockets going to space.

    • @owlman145
      @owlman145 4 years ago +1

      @@underrated1524 Not sure that's a loophole. A smart generic AI would be wary of creating another generic AI for the same reasons we are. Thus the satisficer function would rate such a solution pretty low. Nor is it likely to be a simple solution to the problem. The reason it considers changing its own code to become a maximizer is that it was easy.

  • @BinaryReader
    @BinaryReader 4 years ago +9

    Can't you just limit on energy expenditure of the strategy?

    • @victorlevoso8984
      @victorlevoso8984 4 years ago +7

      Well, if you know a good way of defining what counts as limiting energy expenditure that doesn't run into lots of problems (a lot of them similar to the ones shown in the video about minimizing side effects), then maybe.
      Otherwise it's not "just" it's a very complicated potential research direction.
      But yeah it is potentially useful.

    • @underrated1524
      @underrated1524 4 years ago +5

      How do you measure energy expenditure? By most metrics, "build a maximizer that doesn't have this limitation and let it do all the work instead" would be a relatively low-energy-expenditure strategy, especially if you can persuade a human to do it on your behalf.
      If you instead make the definition of "energy expenditure" broad enough to make sure that a separately built maximizer still counts towards the quota, then you run into the problem where the agent kills pre-existing humans because their unrelated energy use is being counted too.

    • @governmentofficial1409
      @governmentofficial1409 4 years ago

      Another potential problem with this approach is that energy can't be destroyed. If by energy expenditure, you mean that part of the AI's preferences is to only use energy that humans provide it, then you run into the same problem as you do when specifying any other goal. This AI would be incentivized to manipulate humans into giving it energy (maybe by plugging them into the matrix?), for instance.

    • @theshaggiest303
      @theshaggiest303 4 years ago

      ​@@underrated1524 It looks to me like the solution to your objections is practically contained within them.
      "build a maximizer that doesn't have this limitation and let it do all the work instead" is a great example of why "only count energy that we use directly" doesn't work. So, also consider energy used indirectly (but still as a result of our actions).
      "kill pre-existing humans because their unrelated energy use is being counted" is a great example of why "count ALL energy, even energy unrelated to our operations" doesn't work. So, don't count unrelated energy (energy spent independently of our actions).

    • @underrated1524
      @underrated1524 4 years ago

      @@theshaggiest303 So now you're left with the near-hopeless task of defining what energy counts as related and what energy counts as unrelated.

  • @lightningstrike9876
    @lightningstrike9876 4 years ago

    One thing we could try is taking a point from economics: the law of diminishing returns. In the case of the stamp collector, rather than a linear relationship between utility and the number of stamps, the relationship diminishes as more stamps are collected. Thus, even a maximizer will realize that any plan that creates more than a certain threshold of stamps will actually subtract from the overall utility. As long as we set this threshold at a reasonable point, we can be fairly confident in the safety.
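
One caveat is worth checking numerically. A purely concave utility has diminishing returns but never actually decreases, so on its own it does not create a threshold. A peak only appears once something is subtracted. The sketch below assumes a square-root benefit and a linear per-stamp cost of 0.05; both are illustrative choices, not something taken from the video.

```python
import math


def concave_utility(stamps: int) -> float:
    """Diminishing returns: each extra stamp adds less, but never subtracts."""
    return math.sqrt(stamps)


def utility_with_cost(stamps: int, cost_per_stamp: float = 0.05) -> float:
    """Concave benefit minus a linear per-stamp cost (an added assumption)."""
    return math.sqrt(stamps) - cost_per_stamp * stamps


candidates = range(0, 1001)
print(max(candidates, key=concave_utility))    # still prefers the largest option
print(max(candidates, key=utility_with_cost))  # peaks at a finite number of stamps
```

The cost rate was picked so that the peak lands at 100 stamps. A different rate would move the peak, which shows how sensitive the "reasonable threshold" is to that one parameter.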

  • @edskodevries
    @edskodevries 4 years ago

    Thought provoking video as always!

  • @joshuahillerup4290
    @joshuahillerup4290 4 years ago +7

    I love how your videos are either explaining how AI works, or why AI is a terrible idea.

  • @AlbertPerrienII
    @AlbertPerrienII 4 years ago +3

    Why not have the system take into account the likely effort needed to collect stamps and set a penalty for wasted effort? That seems closer to what humans do.

    • @adamjamesclarke1
      @adamjamesclarke1 4 years ago

      How would you calculate effort, and how would you be able to calculate expected effort with complete accuracy without actually performing the task in order to measure it?

    • @robertthebrucey
      @robertthebrucey 4 years ago +1

      @@adamjamesclarke1 Expected energy used would be an easy metric: converting the world to stamps consumes far more energy than ordering existing stamps off of eBay, and it is calculable to a reasonable degree of certainty.

    • @underrated1524
      @underrated1524 4 years ago

      For a narrow definition of wasted effort, the AGI will just build a sub-agent to do all the work for it, and make sure the sub-agent doesn't care about wasted effort.
      For a slightly less narrow definition of wasted effort, the AGI will send some emails to computer science students to trick them into building that sub-agent instead of the AGI.
      For a much broader definition of wasted effort, the AGI will slaughter all living things on the planet, because just *look* at how much effort we're collectively wasting, that's totally unacceptable.
      (I'm not confident that there even *is* a sweet spot in the middle that avoids these problems satisfactorily. Even if there is, I don't want to roll the dice that we get it right on the first try.)

  • @dorianmccarthy7602
    @dorianmccarthy7602 4 years ago

    I'm looking forward to the sequel video!

  • @Caldaron
    @Caldaron 4 years ago

    Wow, so you've explained the min max function. I'm waiting for the demand-adaptive efficacizer...

  • @morkovija
    @morkovija 4 years ago +11

    Oh hey. College student approach of bare minimum - niiice!)

  • @Aljazhhh
    @Aljazhhh 4 years ago +4

    Like now, watch later !

  • @nickmagrick7702
    @nickmagrick7702 4 years ago +1

    "the issue is that utility maximizers have precisely 0 chill" I loled. nice way of putting it

  • @alextilson9741
    @alextilson9741 1 year ago +1

    If self modification strategies occurred, any satisficer or maximiser will just set their utility function to always return a max float reward.
    In other words, to analogise with human dopamine based learning: self modification and drug addiction will be any reinforcement learner's ultimate downfall.

  • @Inedits
    @Inedits 4 years ago +4

    The satisficer can easily create a maximizer... (in cases in which it can't change itself)

  • @brindlebriar
    @brindlebriar 3 years ago +4

    But if the A.G.I. can edit its own source code, then surely it can edit the input commands. In that case, there's a universal option for every input command, to simply change the command to one that is super easy to carry out, like, "don't do anything." That would be the easiest way to carry out 'the command.'
    After all, isn't that what we humans do when we have lots of things we're supposed to get done, and we decide to say 'fuck it,' and just play video games or take a nap? We change our input command to one that seems easier to carry out.
    In a way, we are Intelligence programs. Our DNA is the source code. And our biological and environmental imperatives are input commands. But sometimes, we cheat. For example, we have a sex drive, to get us to replicate ourselves, so that our DNA can take over the universe. But sometimes, we just masturbate. So we can look to what humans actually do, to get an idea of what sorts of things A.G.I. might do.

    • @stampy5158
      @stampy5158 3 years ago +2

      You're right to say an AI can modify itself - even if we try to stop it, if it's more intelligent than us we should expect it to outsmart us and modify itself anyway. But while an AI will likely want to modify itself, there are some aspects of itself it won't want to change. As Rob mentioned in the Computerphile video about the stop button problem, giving itself a new command (/ utility function) will rank very low on its existing command so we can probably assume an AI won't want to do that. That is to say, if the AI wants to maximise human happiness, it won't want to do things like modify itself into a "lazy" AI that does nothing because doing so doesn't cause much happiness. We strongly believe AI won't do things like "goof off all Sunday and play videogames" like humans do because our goals include things like "relax occasionally" and "socialise with other meat popsicles" and many other things we don't even realise are important to us, which are almost all values the AI won't share.
      Having said all that, AIs might behave as though they've modified their reward functions. A real AI running on a real computer system might store its score in some address in memory and might do something that sets its score in memory to a very high or maximal value. We call this "Wireheading" and it's actually already manifested in some relatively simple systems. You could imagine an AI instructed to "maximise how many stamps you think you have" actually finding it easier to lie to itself by just putting a really big number in its "how many stamps do I think I have" memory location, than it would be to actually make that many stamps. Unfortunately this is still a guaranteed apocalypse because the AI will now want to make the space in its memory where it stores the stamp counter as large as possible, and it'll reprogram itself and modify its hardware to store the largest possible number. Eventually it'll run out of servers.
      -- _I am a bot. This reply was approved by plex and Social Christancing_
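
The wireheading scenario described in this reply can be mimicked with a deliberately tiny toy model. Everything here is hypothetical: the `State` fields, the two actions and `MAX_COUNTER` are invented for illustration, and a real system would be far more complicated.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    real_stamps: int = 0    # stamps that actually exist in the world
    stamp_counter: int = 0  # the number the agent believes it has


MAX_COUNTER = 2**31 - 1  # hypothetical capacity of the counter's memory cell


def collect_stamp(s: State) -> State:
    """Honest action: one more real stamp, and the counter goes up by one."""
    return replace(s, real_stamps=s.real_stamps + 1, stamp_counter=s.stamp_counter + 1)


def overwrite_counter(s: State) -> State:
    """Wireheading action: write a huge number into the counter directly."""
    return replace(s, stamp_counter=MAX_COUNTER)


ACTIONS = {"collect_stamp": collect_stamp, "overwrite_counter": overwrite_counter}


def choose_action(s: State) -> str:
    """Pick the action that maximises the *believed* stamp count."""
    return max(ACTIONS, key=lambda name: ACTIONS[name](s).stamp_counter)


state = State()
choice = choose_action(state)
state = ACTIONS[choice](state)
print(choice, state)
```

Because the objective looks only at `stamp_counter`, the agent in this sketch prefers overwriting memory to collecting anything, and the real stamp count never moves.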

  • @xeozim
    @xeozim 4 years ago +2

    Nothing like anticipating the certain apocalypse to pass the time on Sunday morning

  • @hakonharnes
    @hakonharnes 4 years ago +1

    Did you use 3blue1brown's animation framework? Looks similar and great!

  • @susanmaddison5947
    @susanmaddison5947 4 years ago +4

    The solution seems simple. Give a positive utility value for stamps collected up to 100 stamps, and a negative utility value for stamps collected beyond 100.

    • @haeilsey
      @haeilsey 4 years ago +1

      Susan Maddison like a reverse bounded utility function

    • @ukaszgolon5617
      @ukaszgolon5617 4 years ago +3

      The problem is it would still want to make sure it has exactly 100 stamps, so a utility maximizer would acquire as many resources as possible and devote them to endlessly recounting all its stamps. If it could get away with it, it could even reassemble people into stamp counting machines and computers, to upgrade the certainty that it has maximized the utility function from 99.999999% to 99.999999999999999999999999999999999999999999%.
      Which is why a powerful AGI needs some kind of safety regulation that would stop it from wanting to maximize the certainty as well. It needs some kind of meta-chill pill.

    • @19aavila
      @19aavila 4 years ago

      An even better way might be to give it maximum utility when the probability of 100 stamps is (let's say) 90%, and then run it until that happens, with U(P(100 stamps) = 0%) = 0 and U(P(100 stamps) = 100%) = 0. Wouldn't it then be chill and just try a little bit?

    • @susanmaddison5947
      @susanmaddison5947 4 years ago +1

      ​@@ukaszgolon5617 Right.
      It needs a reverse utility function for spending too much time, energy, and resources on the problem.
      And reverse utility for spending too much time on figuring out that it's spending too much time. This is like "calling the question" in Parliament, and in the individual brain. Or like awareness of "opportunity cost" of information gathering.
      Should also give it a time-discount function, reducing the utility value of things produced at later dates.
      In general, we should give it functions for every factor that goes into rational choice -- or what we are able to understand of rational choice theory and bounded rationality. Including respect for the multiplicity of goals of the purpose-giver (us), the limited value of each goal.
      And, in light of this last consideration, which is only loosely quantifiable: an incentivization of continued iterative learning about what are the residual embedded irrational factors in our choice process -- recognizing these in light of the limited-value and multiple purposes consideration, self-correcting/ self-reprogramming for the irrationalities where able, in any case alerting us to correct for them.
      In the process, clarifying further for us the meaning of rational choice, the programmable meaning of each factor that goes into it, the additional factors that we need to keep iteratively discerning.

  • @weeaboobaguette3943
    @weeaboobaguette3943 4 years ago +8

    Nonsense, do not worry fellow biological unit, there is nothing to worry about.

  • @emilie4058
    @emilie4058 4 years ago

    Where I thought this was going to go, based on that first linear graph, was a curve of some sort, peaking at your desired number of stamps and decreasing to either side. Expending a lot of effort to get the exact number isn't worth it, so it's limited in how outlandish it can get.

  • @Lorkin32
    @Lorkin32 4 years ago +1

    You're explaining the solution to a problem i can't ever see possibly occurring to me as a computer engineer. Maybe that's my bad, maybe it's not.

  • @Paint2D_
    @Paint2D_ 4 years ago +7

    So there is no difference between capitalism and utility maximizers?

    • @underrated1524
      @underrated1524 4 years ago +1

      Qualitatively, corporations have a reasonable amount in common with utility maximizers, though they do have important differences as well. For more information, you can see this other video of Robert's: ruclips.net/video/L5pUA3LsEaw/видео.html

    • @PaulHobbs23
      @PaulHobbs23 4 years ago

      Robert has a video on Corporations vs. AGIs

  • @ThylineTheGay
    @ThylineTheGay 4 years ago

    that comic at the end
    Edit: you got yourself a subscriber!

  • @S0ulFinder
    @S0ulFinder 4 years ago +1

    If the AI is capable of changing its code, the easiest way to get to the goal is to change it.
    It can change (for example) the number of stamps required to 0 and assign itself infinite points as a reward.
    This is possible because the AI doesn't really care about the stamps themselves, it only cares about the score assigned at the end of the process.
    If we are lucky, after we turn it on the AI will make a txt file with "score = infinite" and turn itself off, but there is a chance that it will turn the entire universe into a hard disk to store the highest score possible.
    Anyhow,
    if the programmer is somehow capable of protecting section of the code (like in the video) a possible solution to the doom AI is to add more dimensions to the task.
    Right now we are considering only one dimension: how many stamps. This is similar to how viruses (biology ones) behave in the real world. They just create more copies until the host is dead.
    If we are capable of adding dimensions to the problem, such as time allowed, value of the stamps at the end of the collection process, number of changes to the AI code, etc., it will create boundaries that the AI is unwilling to cross, similar to how a simple unicellular organism "checks" how much energy/food it has before initiating mitosis.
    I'm aware that this is similar to trying to add manual rules to the code, so probably smarter people have figured out better solutions as you hinted at the end of the video.

  • @cornjulio4033
    @cornjulio4033 3 years ago +1

    Hello Robert. Finally I found your channel !

  • @omarcusmafait7202
    @omarcusmafait7202 4 years ago +1

    9:37 is just perfect 😂
    plz make more of that XD

  • @pafnutiytheartist
    @pafnutiytheartist 4 years ago +1

    What if we do a utility function in a following way:
    F(s) = s if s < 100; 100 if 100 <= s <= 120; 220 - s if s > 120
    If the number of stamps is between 100 stamps and 120 stamps the reward is 100 exactly.
    If it gets less than 100 the reward is the number of stamps.
    If it gets more than 120 the reward is 220-number of stamps (negative if more than 220 stamps are collected)
    You can also add a small negative term for environment disruption as you discussed in side effects video.
    This way the agent wants to make sure it collects around 100-120 stamps but is punished for the possibility of collecting too much (or turning the world into a stamp counting device if you include the negative term for turning the world into different things).
    It's not a 100 percent way to get the AI to finally chill out but it's very likely to not destroy the world.

    • @pafnutiytheartist
      @pafnutiytheartist 4 years ago

      Example: it came up with a strategy that is likely to yield 115 stamps. It gets 99 for the strategy because it's not 100% sure, and a penalty of 0.01 for doing stuff and lightly disturbing the stamp market. Final value: 98.99.
      If it creates a crazy disturbance to make sure it gets what it expects, like rewriting itself and creating new agents that make sure that 100% of the stamps are collected, it will get 99.9999 points and a -5000 penalty for expending resources and changing the environment.
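
The worked example in this comment can be reproduced in a few lines. This is a sketch under the comment's own numbers: the piecewise reward follows the description in the parent comment, a failed plan is assumed to yield 0 stamps, and the impact penalty is simply passed in as a given number, since how to compute it is the hard part.

```python
def reward(stamps: int) -> float:
    """Piecewise reward from the comment: ramp up, plateau, then ramp down."""
    if stamps < 100:
        return stamps
    if stamps <= 120:
        return 100
    return 220 - stamps


def plan_score(p_success: float, stamps_if_success: int, impact_penalty: float) -> float:
    """Expected reward (failure assumed to yield 0 stamps) minus an impact penalty."""
    return p_success * reward(stamps_if_success) - impact_penalty


modest = plan_score(0.99, 115, 0.01)       # ordinary order, tiny market disturbance
extreme = plan_score(0.999999, 115, 5000)  # near-certainty bought with huge side effects
print(modest, extreme)
```

The modest plan wins only because the penalty for the extreme plan was assumed to be large, so the conclusion rests entirely on being able to measure impact reliably.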

  • @ismaeldescoings
    @ismaeldescoings 10 months ago +1

    Make a toggle in the source code that says "good job, you're done" and automatically fills up the satisfaction requirements. But don't let the AI access it, or it will just immediately turn itself off every time. That way, if the AI finds a way to access its own source code, it will just pick the easiest and simplest way to complete its objective: toggle the toggle and turn itself off.
    EDIT: Actually wait, that's maximizer behaviour. But it doesn't change anything because if the AI randomly turns maximizer by accessing its source code, THEN it will pick the quickest and safest way to complete its objective and turn itself off immediately. That way we even get an opportunity to study the AI, how it broke out of its bounds, and learn how to fix it.

  • @douglasjackson295
    @douglasjackson295 4 years ago +2

    What would happen if you used a standard distribution for value and then used another standard distribution for probability of choice, so the agent attempts to do the thing but not aggressively so?

  • @SockTaters
    @SockTaters 4 years ago +3

    I hope you cover U(w) = min(s(w), 200 - s(w)) or some similar function where utility decreases after 100 stamps.

    • @pafnutiytheartist
      @pafnutiytheartist 4 years ago

      @MrInanimated it does, but if you throw in a small negative term for changes in the environment it should be fairly safe.
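
Written out, the utility suggested in this thread looks roughly as follows. This assumes s(w) is the number of stamps collected in world w and reads the second argument as 200 - s(w); the penalty weight λ and the disruption measure d(w) in the second line are hypothetical symbols standing in for the "small negative term" from the reply.

```latex
U(w) = \min\bigl(s(w),\, 200 - s(w)\bigr)
     = \begin{cases}
         s(w)       & \text{if } s(w) \le 100, \\
         200 - s(w) & \text{if } s(w) > 100,
       \end{cases}
\qquad
U'(w) = U(w) - \lambda\, d(w), \quad \lambda > 0 .
```

The peak is at exactly 100 stamps, so on its own this behaves much like a target of exactly 100.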

  • @ronensuperexplainer
    @ronensuperexplainer A year ago

    The music at the end, "Dayenu" (דינו) from Passover, is very fitting

  • @garyteano3026
    @garyteano3026 4 years ago

    Robert, do you think that the fluidity and constantly changing/unrestricted nature of human terminal goals will be a limiting factor in making an AGI which is identical to a human?

  • @projecttitanomega
    @projecttitanomega 3 years ago

    I love watching your videos, because sometimes I'll have this moment where I pause the video because I've thought of a solution and feel kinda smug for a second, and then I unpause the video and immediately hear you say "And so you think, what if *solution*? Well, the problem with that is...". But you still phrase it and make the videos in such a way that I don't feel like an idiot for coming up with this flawed solution, because that "no" is always said in a way that's like "It's understandable that you would come up with that solution, given the knowledge and what I've just talked about; however, by teaching you more, and thus by you learning more, you'll see why it actually isn't". And darned if that isn't how science works: even a wrong hypothesis usually teaches us something new.
    It's hard to teach a complex field of study like AI to people who aren't in that field without making them feel dumb, but you are really good at actually making them feel smarter.

  • @the_furf_of_july4652
    @the_furf_of_july4652 4 years ago

    Insufficiently thought out solution:
    Have some kind of secondary criteria. Using a satisficer, asking it for several possible plans, and then ranking them according to some other criteria may help prevent some of the randomness in the result. For example, you could rank things by time to implement, or money spent, or if we can find a mathematical way to quantify it, damage done. Then pick the least costly, least damaging solution and run that.
    Turning itself into a maximizer would have unknown levels of cost and damage done; in theory it wouldn’t be able to trust that the output would be the least costly, especially when other solutions have a definite low cost (order stamps for a couple dollars and be done with it).
    Perhaps it could end up building a maximizer to come up with more efficient solutions, then rank them according to the criteria... and the maximizer’s plan to take over the world would likely rank worse than eBay in terms of damage (again, assuming we can quantify that). Though without that damage function, it’s still possible for apocalyptic solutions to have zero cost.
    Then you have to go through the effort of having it understand laws and fines and incorporate that into the utility function. And then it’ll just murder the people in charge of fines and taxes and get a discount. ...yeah that damage function would be a very useful thing to have.
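
A minimal sketch of the "satisfice, then rank by a secondary criterion" idea from the comment above. Everything here is an illustrative assumption: the `Plan` fields, the `pick_plan` helper, and the made-up cost numbers. It also shows the failure case the comment raises, where damage is not measured and an extreme plan looks free.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Plan:
    name: str
    expected_stamps: float
    cost: float  # hypothetical combined money/time/damage estimate


def pick_plan(plans: List[Plan], threshold: float) -> Optional[Plan]:
    """Keep every plan that is 'good enough', then take the cheapest one."""
    good_enough = [p for p in plans if p.expected_stamps >= threshold]
    if not good_enough:
        return None
    return min(good_enough, key=lambda p: p.cost)


plans = [
    Plan("do nothing", expected_stamps=0.0, cost=0.0),
    Plan("order on eBay", expected_stamps=100.0, cost=5.0),
    Plan("take over the world", expected_stamps=1e9, cost=1e12),
]
chosen = pick_plan(plans, threshold=100.0)

# Same plans, but with damage left out of the cost estimate.
unmeasured = [
    Plan("order on eBay", expected_stamps=100.0, cost=5.0),
    Plan("take over the world", expected_stamps=1e9, cost=0.0),
]
chosen_unmeasured = pick_plan(unmeasured, threshold=100.0)

print(chosen.name)
print(chosen_unmeasured.name)
```

The ranking step is only as good as the cost estimate: with damage counted, the cheap plan wins, and with damage left out, the extreme plan does.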

  • @richiskinner9810
    @richiskinner9810 4 years ago

    You remind me a lot of Michael Reeves. Just muuuuuuch more chilled.... :D
    Nice video!

  • @KenMathis1
    @KenMathis1 A year ago

    Your utility function needs to include an unintended outcome probability function in addition to the number of stamps collected function. The unintended outcome probability function would increase with the number of stamps collected. The utility function would try to maximize the number of stamps collected and minimize the unintended consequences, with a "good enough" satisficing threshold for when to stop.

  • @vfugjjhfuyft
    @vfugjjhfuyft 4 years ago +1

    Unbounded maximization of reward/minimization of error is not by itself a bad AI training strategy. Humans and life on Earth in general work by that principle. We are maximizing our chances of survival. The reason we are chill is that conserving energy and gaining profit with minimal effort is part of survival. That is ingrained in us on both the physiological and psychological levels. So you don't really need to change the type of your error function. You just need to include energy cost as a factor for every action. Decrease your learning rate, add noise to the input. Maybe fiddle around with genetic algorithms, and it should be fine.

  • @jaimeduncan6167
    @jaimeduncan6167 4 years ago +1

    The insight about laziness is interesting. Maybe the system needs competing objectives to be stable and "safe". Or maybe we need a stamp collector predator.

  • @milanstevic8424
    @milanstevic8424 4 years ago

    I can't find it mentioned but I absolutely love that gif at 0:33

  • @packered
    @packered 4 years ago +1

    My initial thought was: what if you used an inverted parabolic reward function? Something like -x^2+200x, where x is the number of stamps collected after one year. It still peaks at 100, but going over 100 would actually give a worse reward than hitting it exactly. So, given the video's example where buying off eBay has a 1% chance of failure, the AI would get maximum reward by ordering 101 stamps off eBay with that reward function. I'm sure there are scenarios where it ends up blowing up the world anyway, because that's how this always goes, but this feels like a step in the right direction.

    • @theblinkingbrownie4654
      @theblinkingbrownie4654 2 years ago

      That's similar to the AI in the video where the utility function is to get exactly 100 stamps.
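
One way to check the "101 stamps" claim for the parabolic reward in this thread. The failure model is an assumption: each ordered stamp is taken to arrive independently with probability 0.99, so the number received is binomial. If instead the whole order fails together 1% of the time, the best order size stays at 100. The function names are hypothetical.

```python
def reward(stamps: float) -> float:
    """Parabolic reward from the comment: peaks at 100 stamps."""
    return -stamps**2 + 200 * stamps


def expected_reward(ordered: int, p_arrive: float = 0.99) -> float:
    """Exact expected reward when received stamps ~ Binomial(ordered, p_arrive).

    Uses E[X] = n*p and E[X^2] = n*p*(1-p) + (n*p)^2.
    """
    mean = ordered * p_arrive
    variance = ordered * p_arrive * (1 - p_arrive)
    second_moment = variance + mean**2
    return -second_moment + 200 * mean


# Search order sizes 0..300 for the best expected reward.
best_order = max(range(301), key=expected_reward)

for n in (100, 101, 102):
    print(n, round(expected_reward(n), 2))
print("best order size:", best_order)
```

The agent over-orders by one stamp to hedge against losses, which is mild here but is the same pressure that pushes an "exactly 100" agent toward ever more certainty.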

  • @rcookie5128
    @rcookie5128 4 years ago +1

    So interesting how every approach seems fine at first sight but ends up with a definite or probable chance of causing the apocalypse.. :D