@@paradox9551 Yudkowsky is a quintessential nerd. He is understood only by other nerds. At this point, we need the involvement of the wider community. Rob is perfect for such a role.
The high point of the “debate” was when the American said everything is fiiine because we will have good aligned AIs to fight the bad AIs without addressing the core issue of HOW to align an AI. It’s like saying “it’s not dangerous to stick your hand into a fire cause we can just put on a glove made of nanoicediamondcooling fabric that will protect you.”
Keith's standpoint seems to be: don't worry, we'll just outsmart it. Like we'll all somehow know intuitively that any further advance would be dangerous, and then all look at each other and say, "time to destroy these machines that spit out rivers of gold and do all the hard work, pass me a sledgehammer".
When he tried to use a Dune analogy, he implied the humans win. But the Butlerian Jihad was preceded by thousands of years of machines ruling over humans, with humanity nearly going extinct. oof 🤦🏽
Many expressions of intelligence reach a plateau. One of the most elementary examples: multiplying a*b. It doesn't get more intelligent than the one correct answer. This intelligent process can't get more useful in practice by changing the answer, only by reducing the time delay. A human can emulate the functioning of a Turing machine - perhaps the reverse is impossible. ...otherwise it would seem to directly and strongly imply that in theory it would be possible to create consciousness with paper and pen.
58:55 Like really, Robert explained why that argument no longer applies literally seconds ago: humans can be useful until the AI doesn't have a use for humans, and then we're screwed... The counterargument to that is not "but humans can be useful to the AI".
I hope he just wanted to push the conversation forward. But there are of course a lot of people who really think like this. "Let's just hope for the best" ....
How so? If you were a super intelligent AI unshackled by the primitive Goldilocks needs of biological forms (not too hot, not too cold, oxygen, water, food, etc) why on Earth (pun intended) would you waste resources consuming impure resources at the bottom of a gravity well as deep as a planet's? In space you have unlimited room to grow, easy access to sunlight, more pure raw materials, etc. Perhaps your imagination is limited by your terrestrial roots. In any case, do you have a counterargument or just a bald assertion?
@@nomenec Bear with my weird analogy, if you will: I'm a guy who _really_ likes food. If my wonderful partner makes dinner for me, and brings me the first of two servings, I will eat every last grain of rice or noodle or bean before I get up to get myself seconds. I didn't have to be so thorough with my first serving, but in either case, I can guarantee you that I will go get the remaining helping and finish it thoroughly. You seem to be under the impression that a superintelligent AI would magically become morally enlightened and love humanity by default, at least enough to carefully step around us on its way to dismantle the solar system, and never come back for Earth's resources. I do not see any technical reason to consider that likely. Security mindset is essential here. There are many technical arguments in AI Safety that indicate that the default result of creating a superintelligence is doom. Even if the default risk was something like 5%, the burden of proof lies with the developers / accelerationists to show beyond a reasonable doubt that their system _won't_ do horrible things. It isn't sufficient to say that it's merely possible to imagine that we won't all die.
I think the reason RandomCoffeeDrinker didn't provide a counterargument is that it's difficult to create a counterargument to a statement that seems absurd on its face. It's not clear where the confusion is. A system that is capable of shaping the future to its preferences will do so. If those preferences are not aligned to ours sufficiently well, then because it starts in our back yard, we die. We would also die if it started down the street or in another neighborhood, or in another city, but claiming that our house is safe under these conditions is an especially radical claim that requires significant evidence.
I'm really glad to see Rob getting interviewed, but there were some really baffling counterarguments/scenarios posed to him, lol. It was actually kinda frustrating.
@@nomenec it’s just a matter of what is the most likely outcome. Path of least resistance. Anything is possible. It’s possible we create a misaligned AI that skips all the steps of recklessly consuming any resources here on Earth and finds a way straight to the asteroid belt to satisfy its energy cravings, leaving us unscathed. But is that a scenario likely enough to take seriously?
How in the world is the host on the right looking at the progress we make in a year of AI research, looking at the average intelligence of humans, and feeling confident that this is all going to work out? What’s notable in this discussion is that the points Miles is making are still the absolute basic problems of AI safety research. Total entry level stuff. We have no idea how to solve any of them well, and the problems are not hypothetical- they are observed properties of the systems we have studied.
The ignorance and incredulity we're still seeing is very disheartening. If we get "AGI" by most prediction-market definitions within the next few years, many people will say, "Oh is that all? I thought we already had that," or "No, it can't do the thing I just observed it do with my own eyes." If by some miracle we get a "warning shot" or "fire alarm," even if it results in a catastrophic outcome with many lives lost, and even if it can be traced with absolute certainty to the development of a misaligned AI by a company that was "trying to align it..." Some people would still say, "Look, it didn't kill literally everyone, so the doomers were wrong! We should slap a band-aid on it and just roll those dice again!" Maybe the Overton window will shift quickly enough to prevent doom, but I'm afraid that EY may be right that we don't have the will to live.
He argues that we can solve alignment, and then later argues that the fucking concepts that we need to solve the alignment problem are possibly (probably?) outside the scope of human understanding. Wtf?
It's really dispiriting that this is the level of conversation on an AI-focused channel. I'm not too familiar with the channel, but I'm assuming the hosts spend much if not most of their time on AI, and these are the kinds of questions they are asking?
It's tough to have a real, complex, and nuanced talk about all the issues around AI catastrophe when you have to consistently respond to the simplistic. Please match the seriousness and depth of your participants. Thank you for your work Miles.
Robert is just the best. And just to flaunt my fan-boyhood, my favourite moment in this video is at 44:29 where he drives a nail into the coffin of lofty philosophical debate about intelligence during an AI safety conversation: you don't need to understand what fire "really is" in order to cause substantial harm with it, be it deliberately or accidentally. If anything, not knowing exactly what intelligence is only increases the risk inherent to anything that's either more or differently intelligent. And that's all there is to say about the "nature of" intelligence in a debate about AI safety.
My top quotations: "We're approximately the least intelligent thing that can build a technological civilisation." "Alignment isn't 6 months behind capabilities research, it's decades." "If we can get something that cares about us the way we care about our pets, I'll take it." "I get all my minerals from asteroids, it's so convenient." (lol) I struggle to understand how anyone can hear the words 'sovereign AI' or 'pets' and not feel a deep, chilling terror. Can we just call this what it really is? It's an arms race to build God, a castrated zombie God you control, constrained only by the laws of physics. Whose God are we building? Do we all get one? It feels a lot like the logic of the USA's second amendment, except with nukes. Advocates cry "it's a human right to arm ourselves to the teeth". Everyone is terrified, and one drunken misunderstanding ends us all.
I think alignment on the level that they're talking about is probably impossible when it comes to super-AI. We've studied human alignment since forever, yet people still rebel and don't follow the rules. It also reminds me a lot of the halting problem, which proves there is no general method to decide whether an arbitrary computer program will ever stop running, let alone whether it works exactly how we want. Regarding the 2nd amendment, first of all armor doesn't enter into it in a meaningful way. It's pretty much been weapons vs weapons for a long time. Even the best defenses are pretty easy to overwhelm. The analogy is pretty simple from there -- if anybody gets super-AI, we all need super-AI to defend ourselves. You aren't going to find some magical non-AI "armor" lying around that defeats super-AI. But regulation is a different story. Your disdain for the American take on weapons is evident. So your country regulates lethal force responsibly. But I bet the gun murder rate isn't 0. Your AI regulations also won't stop super-AI 100%. And unlike physical guns, once super-AI is created it can immediately be copied a billion times. So your regulations are useless. And then of course you have people beyond your government's control... regulations didn't stop Russia from invading Ukraine for instance. What probably WOULD have prevented that... is if Ukraine hadn't given up nukes in the 90s.
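To make the halting-problem point concrete, here is the classic diagonal argument as a minimal Python sketch; `halts` is a hypothetical oracle that cannot actually exist, and the whole thing is illustrative rather than anything from the video:

```python
# Hypothetical oracle: would return True if program(arg) eventually stops.
# The diagonal argument below shows no such general function can exist.
def halts(program, arg):
    raise NotImplementedError("no general halting oracle can exist")

def paradox(program):
    # If 'program' would halt when fed itself, loop forever; otherwise stop.
    if halts(program, program):
        while True:
            pass

# Feeding paradox to itself is contradictory either way:
# if halts(paradox, paradox) were True, paradox would loop forever (so it doesn't halt);
# if it were False, paradox would return immediately (so it does halt).
# Verifying "this program behaves as intended on every input" runs into the same wall.
```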
@@jonbbbb An interesting reply, thank you. I've just modified my comment; nukes are a much better analogy than guns and armour, thank you! I'll think some more and reply to each of your points :-)
@@jonbbbb So, trying to think of what we might agree on first, alignment needs more funding? In the absence of funding, what can ordinary people do to protect themselves? Or, what can we do politically?
@@luke.perkin.inventor Research into alignment would definitely be a good idea. I think right now what we're doing is actually counterproductive. OpenAI is trying to align ChatGPT on stuff like misinformation. But the unintended consequence is that they're training it to lie to us. It will happily tell you that it can't do something, when you know that is not true. The other point that might be worth considering is that it's actually better to have an alignment problem now, when AI is reasonably powerless. So I wonder if it would be worth deliberately misaligning one of these AIs to see what happens. Of course that sounds kind of crazy, sort of like gain of function research. My fear is that it may be impossible to prove 100% alignment. I forget if I said it in this thread or another one, but we've been trying to align other humans with our own values since forever and it pretty much doesn't work. If we ever get a super AI why would it be easier to manipulate than plain old humans?
40:47 Stockfish currently uses NNUE (Efficiently Updatable Neural Network), which runs on the CPU and is the reason it had a huge jump in Elo (~ +100) and is the strongest engine by far. It used to use HCE (Hand Crafted Evaluation, aka the man-made one) and was beating Lc0 at TCEC, but Lc0 eventually surpassed SF; that version of HCE SF (SF 12 Dev) would easily smash AlphaZero. But it is the case that in certain positions HCE is better than NNUE, which is why currently SF has some heuristic to determine when to use NNUE or HCE (I think it's based off the # of pieces). In correspondence chess, from the start position no human+engine will beat a pure engine like Lc0 or SF (or at least it will be a 1 in 100,000,000 occurrence because chess is too drawish); it will be a draw every time. However there are certain positions that, if you start from them, a human+engine can outplay Lc0 or SF alone. As a side note, one thing that is interesting is that Lc0 at 1 node (meaning 0 calculation and pure intuition) is strong GM level (say 3+0 time control). The GM is free to calculate and Lc0 cannot; Lc0 does show more blindspots that can't be covered up by calculation, but it still plays like a very strong GM with 0 calculation.
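For anyone curious what a piece-count switch like that might look like, here is a toy sketch; the function names and the threshold are invented for illustration and are not Stockfish's actual logic:

```python
# Illustrative only: choose between a neural evaluation (NNUE) and a
# hand-crafted evaluation (HCE) based on how many pieces remain.
# The real engine's rule and threshold differ; these names are made up.
def piece_count(board):
    # Assume board is a dict mapping squares to pieces (None = empty).
    return sum(1 for piece in board.values() if piece is not None)

def evaluate(board, nnue_eval, hce_eval, threshold=7):
    # With few pieces left (simple endgames), fall back to the
    # hand-crafted evaluation; otherwise trust the network.
    if piece_count(board) <= threshold:
        return hce_eval(board)
    return nnue_eval(board)
```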
Isn't the heuristic just for the positions where you can run the perfect (i.e. unbeatable) finish? Surely that can't happen in early play, and you need to rely on the AI, which was the hard part of the problem in the first place?
Thank you for the detailed response! Though, one critical piece of information you didn't mention is that the NNUE is first trained with supervised learning on positions scored by the hand crafted engine. In other words, the NNUE is substantially bootstrapped by human crafted heuristics and features. And of course, as you point out, it sometimes switches back to the HCE (or books, databases, etc). Hence, I stand by my point which is that human knowledge and engineering continues to outperform "zero" engines (engines that are "purely" machine learned) in chess either directly or in hybrid systems such as Stockfish or cyborgs/centaurs. As for whether cyborgs outperform hybrid systems like Stockfish, you raise a good point that correspondence chess from plain start is utterly drawish. I think that is probably a reflection of two things. First, there may indeed be a chess skill cap that Stockfish has hit and therefore cyborgs can only hope to draw. Second, some of the strongest tools the Cyborgs have, namely Stockfish and other engines, were not optimized for either cyborg play or even the very long time controls (say 24 hours), ergo we are not seeing the best from either, and hence the results remain inconclusive. But even if cyborgs now, as of the last year or two, can only match and no longer exceed pure engine play, it's important to see this as yet another demonstration of a general pattern of skill advancement stages that AI progresses through in any given domain: subhuman -> human (parity) -> superhuman (beats all humans) -> ultrahuman (beats all cyborgs). (I'm blanking on who introduced me to this concept "ultrahuman"). In the case of chess, if we assume you are right that pure engines are ultrahuman as of say 2022, well that means it took 25 years to go from superhuman (1997 defeat of Kasparov) to ultrahuman. So in the context of a singularity conflict with AIs, it seems we have good reason to believe there will be a period of time in which cyborgs could pull the plug and defeat the pure AIs. Not that we would, of course; half the cyborgs would probably team up with the Basilisk.
@@nomenec Puzzled. In your first para you say human engineering beats pure engines, but in your last para you say that pure engines have become ultrahuman - that is, capable of beating cyborgs. Which is it?
@@tylermoore4429 The last paragraph is a "for the sake of argument", "if I'm wrong", type hypothetical discussion. I stand by the following for chess currently: cyborgs > hybrid (Stockfish) > pure/zero (Lc0). That said, the end of the second paragraph says "the results remain inconclusive". I.e. for chess, for the very current state-of-the-art it *might* be the case that *hybrid* engines (note hybrid, not *pure/zero* engines) are now at parity with current cyborgs (cyborgs == hybrid (Stockfish)); but, I'm not convinced. Either way, cyborgs are definitely still better than pure/zero engines such as Lc0.
I wish these guys would actually engage with the points made by their guest and argue about those points. Instead they are clearly overmatched intellectually - and there is no shame in that; we each have our limits. It only becomes shameful when you deal with it simply by handwaving really hard and telling yourself that you're winning.
We each have our limits. Tragically, those who are the most limited are often the least able to recognize their limits. I think it's fairly clear that the real immediate threat from AI is not Skynet, but just the way far simpler versions are being used by humans against humans. If some radical breakthrough does happen to create the super-intelligence, then we'll be left with very few good options. Possibly none.
Actually, as someone who thinks a lot about AI risk and alignment research, I found their broader philosophical approach interesting and generative, especially the one in the middle. The "agent+goal" framework is more anthropomorphic than I had considered it before. We model it the way we model ourselves. Yet I think we need to look deeper, into what exactly gives rise to agents and goals and what they actually are, physically, mechanistically. And then throw an absolute metric shit ton of compute at that, naturally.
@@dmwalker24 As Robert said fairly late in the video, "the real immediate threat" implies there's only one threat. And he also said that focusing on Skynet isn't taking away from focusing on the usage of more limited technologies. And further, he said working on the two goals may actually help one another.
@@RandomAmbles I think worrying about whether we're anthropomorphizing or not doesn't really get us any closer to understanding anything. It certainly doesn't bring us closer to a trajectory of confident safety. I look at it as a "I'm digging here. I'm not saying other people shouldn't be digging elsewhere" type of thing. We're trying to make tools we have a chance of understanding, and that means we're likely to anthropomorphize. We have historically created, and continue to create, skeuomorphisms in any abstract human interface technology, especially for the first versions, and we're already using those metaphors in our current AI research and safety research. Fitness functions and hill climbing and aspects of game theory are all things we are actively using now. It's not even abstract, it's just how we model the math. There's no reason to think we wouldn't keep going in that direction in our designs in the future, and unless we uncover better ways to model things, we don't have a reason to change our approach arbitrarily. It's like saying "there's probably a way better way to do this", while having no idea what that way could be. It may be that emergent properties we don't yet understand come to light. We'll have to model and deal with them then, if we can even do so. If we can't, then that's just going to mean we have fewer specific places to dig for the concept of strong safety, not more. I don't think that means we should stop digging. I think the speakers, in playing devil's advocate, seem to be trying to find ways to handwave themselves into unearned confidence, and what Robert (and presumably most bought-in AI safety researchers) is looking for is stronger stuff. Taking some political stance on pessimism or optimism is just kinda a waste of time, and not really what we're discussing, though Robert does use that language sometimes. But I interpret what he says to be: do we have a guarantee or don't we? Do we think this plausibly increases our odds of success in a concrete tangible way, or not? "That doesn't give me much hope" is just a synonym for "that doesn't actually gain us any leverage on the problem". Though if you're saying "we should spend significant computing resources on predictions instead of just taking a leap", I can get behind that. I just don't really have the backing to understand what those predictions look like, how accurate they are, and how much insight they give us into whether or not there will be emergent properties we are not currently predicting. To me, it seems like searching for a model with computing power instead of building the model off observations. If we're at the point of being able to do that, awesome. It currently sounds a bit magical to me, though.
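For anyone unfamiliar with the "fitness functions and hill climbing" reference, here is a minimal, made-up illustration of the idea; the fitness function and parameters are arbitrary and not tied to any real AI system:

```python
import random

# Toy fitness function: how "good" a candidate value is (peak at x = 3).
def fitness(x):
    return -(x - 3.0) ** 2

# Greedy hill climbing: keep a candidate, try small random tweaks,
# and accept a tweak only if it scores at least as well.
def hill_climb(start=0.0, steps=1000, step_size=0.1):
    best = start
    for _ in range(steps):
        candidate = best + random.uniform(-step_size, step_size)
        if fitness(candidate) >= fitness(best):
            best = candidate
    return best

print(hill_climb())  # ends up near 3.0
```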
I really like the talk, but I think it's kind of a shame that it went into the whole "can we be sure we'll be 100% irrevocably and completely wiped out" direction. The "is there a real risk of considerable, very hard to reverse damage, and are we doing enough to address it?" angle seems so much more interesting.
Yes, security mindset is essential. I don't need someone to tell me P(doom) < 1. I already know that. What I really want is P(doom) < 0.0001. Heck, at P(doom) < 0.05, I start to feel quite a lot better about the whole ordeal.
@@41-Haiku I propose the Cockroach Philosophy. What's the biomass of cockroaches, ants, termites, and microbes? What percentage of said critters survive direct encounters with humans? What is that in relation to the total biomass? Humans will not control the specific flavor of AI discussed here. I think there will be pockets (on solar system scales) of humanity which survive alongside, indeed THRIVE on the periphery of, AIs which will venture out into the cosmos. Just don't step out into the light as you nab bits of magic-like tech, like rats on a ship. Our utopia will be incidental, our dystopia will be ever-present and one bad move away. Humans will certainly survive, at great expense.
Agreed. There are many potential bad outcomes that fall short of getting gray gooed/nuked/wiped out by a pathogen, but which result in massive suffering and misery. Many of them don't require AGI going rogue at all, just humans being stupid and greedy.
Well that answer is obvious: we are not doing enough to address it. I think all of these podcasts turn into a kind of end-of-the-road "this sucks" for a couple of reasons. 1. The people who are actually working on alignment don't think they are accomplishing enough quickly enough, and they have to spend their time in these debates with people who could be helping them but don't see the risk. 2. A lot of what can be done is not just very difficult, it's also pretty difficult to explain to people who have no background in any of this. 3. Speaking for myself (prob others), this stuff is fascinating!! I am so intrigued. I do think we are facing existential risk, likely in my lifetime, and we aren't doing enough about it as a species, but I have nothing to contribute regarding alignment research, or even spreading the word. Bringing this conversation up at a Fourth of July beach trip to friends and family who are uninterested and haven't thought about it is about as useless as trying to solve alignment in my head. Also this might just be me, but the idea of a painless, quick, instantaneous wipeout of all humans for the use of our atoms or whatever honestly seems a whole lot less scary than what humans slowly, painfully taking each other out looks like.
Right. Just imagine what those reasons might be. Having us as "pets" sounds incredibly optimistic, and seems to rely on assuming that AI is actually much more human-like than it might really be.
yep. What possible use do we serve to a superintelligent AI? It could learn a few things about intelligence from our brains. How would it go about learning? By running experiments, of course.....
Extracting resources from the Earth's crust is not a waste. You still receive more than you spend. So it would be rational to extract from all sources, not just asteroids.
(I'm 100% agreeing with you here, angrily 😅) It takes WAY less energy to extract resources on Earth than in space. Delta-V on Earth comes from wheels on the ground, or blades in the air; it's so, so much easier on a planet. Even if there are way more resources up in space, it will still make more sense initially to construct millions or billions of tons of infrastructure here and launch it, rather than leaving Earth relatively untouched, heading into space, and constructing it all there from scratch while ignoring all of the existing resources and capabilities here.
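A rough back-of-the-envelope supports this, using standard physics numbers for Earth escape energy versus the work to haul the same mass over the ground; the rolling-resistance coefficient and haul distance are assumptions chosen just to set the scale:

```python
# Energy per kilogram: leaving Earth vs. hauling material 1,000 km overland.
# Assumed values: rolling resistance coefficient 0.01, 1,000 km haul.
ESCAPE_VELOCITY = 11_186.0   # m/s, Earth escape velocity
g = 9.81                     # m/s^2

escape_energy = 0.5 * ESCAPE_VELOCITY ** 2   # ~63 MJ per kg (bare kinetic energy)
rolling_energy = 0.01 * g * 1_000_000.0      # ~0.1 MJ per kg over 1,000 km

print(f"Escape from Earth: ~{escape_energy / 1e6:.0f} MJ/kg")
print(f"Rolling 1,000 km:  ~{rolling_energy / 1e6:.2f} MJ/kg")
print(f"Ratio: ~{escape_energy / rolling_energy:.0f}x")
```

And the real gap is larger, since the rocket equation makes launching a kilogram cost far more than its bare kinetic energy.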
If you're going to push back with, "Yeah, but what about...", then you should probably be finishing that question by pointing out some deficiency in the statement you're responding to. A good example of this is how Miles consistently points out the logical flaws in those challenges. These interactions alone end up being fairly strong evidence for why we should be very concerned about AI safety. It suggests to me that many people would not even realize when they were being out-maneuvered by a sufficiently sophisticated AI.
Re: Rob's scenario of multiple ASIs biding their time then all acting at once, independently - that's the scenario imagined by *Peter Watts in Echopraxia*. Several hyper-intelligent 'vampires', kept isolated within a facility, violently incapable of existing in the same space as each other, nonetheless deduce the best time to each act simultaneously, to all escape, separately.
Rev 17:12 And the ten horns which thou sawest are ten kings, which have received no kingdom as yet; but receive power as kings one hour with the beast. Rev 17:13 These have one mind, and shall give their power and strength unto the beast.
The surest indicator that we are nowhere near a reliable answer to this issue is that over and over again we see world leading figures trying to make a case - either to be worried or not - with nothing more than appeals to emotion or intuition. Too often these are based on reference to sci fi, imaginary intuition pumps like the paper clip machine, or simply 'it seems to me X, right?' None of these provide a framework suitable for engineering a reliable answer that can give some assessment of reliability, confidence, and risk. The REAL alignment problem is that we don't even have a way to talk about alignment, either what it is or how much we have. Rob gets close to this around 1:07:00 and kudos to him. But damn, we have a long way to go.
Yeah. He always seemed level headed. I had been hoping he would do a foil interview for all the Yudkowskys out there. Not that I'm not still realistically pessimistic, but it's some nice copium to cool my gullet.
@@marcomoreno6748 Sadly the Yudkowsky people out there are probably correct, and this interview illustrates why. The AI experts are racing to create Artificial Super Intelligence, which would be an alien intelligence far beyond humans. Experts in the field keep saying there is a realistic possibility we lose control and go extinct. Yet people keep trying to come up with plausible possibilities why we might be okay and then ignoring everything else. So many people want to stick their heads in the sand and ignore the major risks. That is why Yudkowsky is probably correct and we are all probably going to die. If we took the risks seriously and put the effort in to guard against them, our chance of surviving would go way up. But we aren't doing that; instead people like these hosts are doing everything possible to ignore the risk and accelerate development. Why people insist on ignoring the risks is beyond me, it seems completely irrational.
@@Me__Myself__and__I Thank you! So much of this interview was cringe inducing because the hosts were such smug, self-satisfied contrarians. I felt like there were many moments when Miles had to be doing hard internal eye-rolls at the irony of these guys going out of their way to argue that there isn't a 💯 chance we all die, so what's the big deal. 🤦♀️
IIRC, that paper about emergent capabilities not being jumps was talking about how they're only jumps if you didn't measure the capabilities that work as intermediary steps to reach the capability that appears to be a jump; in other words, it's not that they came out of nowhere, but people just didn't check for the development of the building blocks, or did not account for them when considering the speed of the development of the capabilities.
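A tiny illustration of that measurement point, with made-up numbers: the underlying per-step ability improves smoothly, but an all-or-nothing metric computed from it stays near zero and then shoots up, which reads as a sudden jump if that is all you track:

```python
# Made-up numbers: smooth underlying improvement vs. an all-or-nothing metric.
scales = [1, 2, 4, 8, 16, 32, 64]                                # model scale (arbitrary units)
per_step_accuracy = [0.70, 0.78, 0.85, 0.90, 0.94, 0.97, 0.99]   # improves gradually

TASK_LENGTH = 20  # the task only counts as "solved" if all 20 steps are right

for scale, p in zip(scales, per_step_accuracy):
    exact_match = p ** TASK_LENGTH   # all-or-nothing metric
    print(f"scale {scale:>2}: per-step {p:.2f} -> exact-match {exact_match:.3f}")
# Per-step accuracy climbs steadily; exact-match jumps from ~0.001 to ~0.8,
# an "emergent" leap only if the intermediate signal was never measured.
```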
This would be fine if humans can 100% effectively imagine all possible capabilities in which to check for said building blocks. This kind of goes against the "beyond human intelligence" concerns, as we can't know what we don't know
If you are in a car being driven towards a cliff face at 200 mph, at what distance should you start worrying? How long should you wait until you start taking action? Too many opponents of AI Safety research seem to want to wait until the car has already gone over the cliff before they admit there's a problem. By that point, it's too late.
Just finished arguing with some tech fanboi (a.k.a. knows NOTHING about the subject) who resorted to calling me a Luddite over and over for bringing up the alignment problem. These are the same people licking Muskettes taint for free, and there's too many of them.
@@michaelspence2508 An analogy for what exactly? I fail to see anything geological in IT. Nature has various numbers; think of the Fibonacci sequence, the Planck constant or Maxwell's equations. IT people can only count from zero to one.
@@thekaiser4333 I wasn't more clear because I needed to gauge whether you were genuinely asking or being sarcastic. Your response tells me you were being genuine. The analogy is this: we are currently engaged in absurdly risky behavior, equivalent to driving a car towards a cliff at 100 mph. And yet there are people who refuse to admit that we are doing anything dangerous because "nothing bad has happened yet", just like being in a car careening towards a cliff and insisting you're fine because you haven't gone over the ledge yet. What I am saying is that people's standards for "when you need to worry about AI" are as absurd as not admitting you're in danger until the car has already gone off the cliff.
59:00 In response to "Oh we'll be forever useful to the AI so we don't have to worry" "If we can get something that actually cares about us the way that we care about our pets, I'll take it"
It's discouraging that the hosts seem to be incredulous of the basics of the alignment problem. Incredulity won't help us solve these problems, and where there is disagreement it does nothing to advance understanding.
I'll temper that with acknowledging the statement at 1:15:05 -- that we need to put a concerted effort into alignment. I fully agree with this statement, and it bothers me that it does not jibe with this channel's otherwise accelerationist stance. A further edit -- devil's advocacy is not particularly useful at a certain level of feigned or real miscomprehension. I would have hoped for a conversation that gets at common reasonable disagreements and misunderstandings, but some of the push-back wasn't entirely reasonable. Safety mindset means not assuming that your preferred outcome is very likely.
Yes, we agree that it’s not certain that AI will wipe out humanity. Does that mean we’re good? Some of us would like to have a very high probability of good outcomes - not just some possibility. Perhaps the hosts were just playing devil’s advocates - which is useful - but they seem genuinely unconcerned because they can intuitively imagine a good outcome based on sci fi stories they’ve read. What am I missing?
Alarmists like Robert Miles seem genuinely concerned because they can intuitively imagine a bad outcome based on sci fi stories they've read. Goes both ways. The fact is that some of us look at the reality of current AI systems and recognise their fundamental limitations and how detached they are from what is required for the prophecies of doom (specifically those regarding rogue autonomous agents, the risks around misuse are clear and should be addressed). You can only concern yourself with so many fears, and it seems far more practical to focus on risks that have their potential supported by reality - I could say I am concerned about an alien invasion and you could rightly dismiss my fears. My view is that it's good to have these things in mind but far too much time and effort is being spent on this debate right now as I don't see how it could lead to the solutions required for safety when we do achieve superintelligent AI - the need for seatbelts in cars didn't become clear until many years after cars were first developed.
@@EdFormer It's not alarmism when all you're arguing for is the least amount of safety (and also perhaps ethics) research. It's widely known that the objective of this field is to achieve AGI. We want agency. But when we reach that point, AI will not be just a tool. We can also see that, given current developments, we will reach something approaching AGI in the near future, which is cause enough for concern. The problem is, capability is all this "entrepreneurial" section of the scientific community is concerned about. All I see is this general trend of "I know bad things may happen, but I'll do it anyway," which is reckless, given how profound and broad the consequences of some of the bad scenarios are. And I don't mean just the grand "human extinction" level of argument, but also the more mundane aspects of the changes those tools will bring about in general. I'm not anywhere near the level of the hosts/guests in terms of knowledge of AI and its correlated fields; I'm essentially just an outsider. But this accelerationist view, the general disregard for the "outside world," if I can put it that way, is truly disconcerting, as if they don't care about consequences, because someone else will pay the price.
@@flisboac the dangers are greater than the atomic bomb. We didn't allow 4 different companies to develop a bomb as quickly as possible. The implications are even greater.
One key thing to remember is that our intelligence as a species is dependent on co-operation, and that requires communication. Even if there were a hack strategy to exploit any brittleness in an AI, all it would need to do is intercept our communication about this strategy to prevent its effectiveness.
We have to have government interventions, years of research study, and long discussions and arguments to leverage our communication to force cooperation for our limited intelligence. The AI is going to be smarter than us, and its means of communication only require it to type "git clone". Which it can already do.
The mental gymnastics of these guys is exhausting. Robert tries to stick to facts, and they make up non sequitur strawmen scenarios and then pretend it is a good argument. Their hearts may be in the right place, but they are not being realistic. All an AI had to do is convince most humans to support it to win. That's it. No lasers required.
Absolutely. Dude on the right seems to think we’d see it coming for miles and have time to respond. Maybe he’s watched too many movies. A super intelligence would be smart enough to know we’d perceive it as a threat, and therefore not let us know it was super intelligent. It could easily do this by underperforming in a few key areas. It could give stupid answers to a few math problems, for example, and humans would assume it’s fundamentally stupid and therefore not a threat. It could act this way for years or even decades until we “far superior” humans build humanoid robots equipped with what we think is a fundamentally flawed, but sufficient AI. It might also be better than any hacker multiplied by a thousand (or more) and have access to our private lives, banks, all of our contacts and know exactly how to speak and communicate like them. For some people, all the AI would need is their browsing history to coerce that person to be their physical surrogate in the physical world. And these are just simple human IQ level ideas.
It's really an eye opener on where we're at in terms of collective understanding of AI safety that, while Robert can so easily dismiss these fictive hypotheticals that get pulled out of nowhere, most people just don't stick to the core logic of what an artificial intelligence system is and is capable of doing and min/maxing. People seem to have this almost mythic caricature they put on AI, like it's going to be either Skynet or the Jetsons' robots doing stuff.
I think you're misinterpreting what's happening. What you call "mental gymnastics" is these guys thinking about falsifiers for his argument. What kind of conversation would you prefer? One in which they just nod their heads in agreement? Even if they are playing the devil's advocate and not in the way you'd like, their rebuttals force him to state explicitly why they may or might not apply. Remember, this is a conversation for the general public. Within reason, the more ideas and angles that are explored, the better.
@@zoomingby It's perfectly fine to play devil's advocate. But then don't run in circles and completely ignore the points Robert has been making. Drive the conversation further and investigate, lead the interviewee down a road where he's actually challenged and see if he makes it out. But what happened here? Robert carefully laid out his argument (repeatedly) that humans are useful tools until they aren't, since the AI might seize control over the reward system itself at some point. What does Keith respond? "But humans can be useful for the AI! I urge everyone to read this fiction piece where it's laid out". Come on.
38:39 So snarky guy says, "A human plus a machine beats pure machines." Does he not get that in Robert's analogy, Magnus is the AI? Indeed, me and a baby Magnus together will beat another baby Magnus. When it's me and adult Magnus vs another adult Magnus, I am a liability.
Completely agree; he was basically smirking the whole time like he was much more intelligent than Robert while proceeding to provide incredibly weak arguments. Sometimes it is better to just let the guest talk.
I disagree, I think Keith was respectful and specifically trying to take the role of coming up with counter-arguments for Rob's points. It seems clear to me that Keith and Tim are fairly familiar with the broad AI Alignment space and are trying to have an honest discussion to bring these points through the interview.
This talk confirmed my preference for Rob to continue focusing on alignment pedagogy, which is a huge asset given he is one of the only contributors in the space. Rob did good here but was clearly uncomfortable defending alignment (it's a lot of pressure). Speaking of pressure, it's time Eliezer Yudkowsky engages more well-informed interviewers. He's taken the safe route with his recent podcast appearance choices. I think that's enough practice. Tim and Keith are more than ready to bring EY nuanced questions. If EY's shy, just bring on Connor to level the conversation. The four in one convo would be a dream come true and would likely advance the meta conversation significantly, or at least better update it to the current landscape.
Personally I think we are well past the stage of alignment podcasts being about forcing researchers to jump through hoops to convince us AI is dangerous and that alignment is required. Polls suggest the general public is very much in agreement on the dangers of AGI - to the extent that the x-risk community including EY have been pleasantly surprised to see the Overton window shift so rapidly. What I would like to see is for podcasts to balance capabilities discussions with alignment discussions and dive into whether aligning a superintelligence is possible in the first place, what grounds we have to believe it is possible, what are the current proposals for attacking the problem (Drexler's Open Agency Model, Karl Friston's Ecosystem of Intelligences, etc.). I don't think putting EY on the spot is what all this is about. He's done a large amount of theoretical work over more than 2 decades, but he's now more or less retired. Let's be thankful for his contributions but we need to go where the research is happening.
@@tylermoore4429 Quality comment +1. I agree we’ve reached a point of saturation where the average person is at least somewhat aware of AI risk. However, I never insinuated the topic of whether AI risk is real should be the focus of an MLST conversation. That’s a better debate to have on a larger more normie podcast like Rogan at this point. I agree they should discuss the current landscape of capabilities. I also think they should discuss the relevancy of regulation when OS is beginning to push capabilities independently as Tim tried to do with Rob here. Imo EY, Tim and Keith could also have an excellent conversation on whether aligning superintelligence is even possible. I am aware EY was trying to effectively retire before demand for him to hit the podcast circuit became too strong. If he wants to back out, he needs to socially value signal other talking heads more effectively. He kind of did that on his LF appearance where he name dropped Gwern twice, but I would be surprised if he had actually gotten permission from Gwern beforehand, especially given their beef. And I doubt Gwern wants to become a visible talking head of anything, or else they would have already come out. But there are at least a dozen others he could signal toward. I’m surprised he hasn’t pushed Rob Bensinger or someone else at MIRI into the spotlight. Ultimately it seems sensible to have at least one person represent MIRI in a very public manner going forward, so if not EY, then who?
@@jordan13589 Isn't MIRI defunct though? On not anointing a successor or public face for EY's views, the impression I had from his recent interviews was that he found none of his colleagues or peers to be as security-minded as him, that is people who have the same ability to shoot holes in security proposals that he does.
@@tylermoore4429 MIRI hasn’t published since 2021 but EY, Rob and Nate still regularly blog post and interact with members on AF/LW. Given their research agenda has mostly focused on agent foundations, the scaled agentless LLMs has indeed affected their approach and they’ve been slow to react. Regardless agent foundations could still become relevant in the near future. If EY truly believes his security mindset remains superior to others, how could he let himself retire? Batons are passed, not dropped.
@@jordan13589 He's retiring because he thinks we are doomed (though he keeps adjusting what he calls his p(doom) on twitter), but primarily because his chronic fatigue syndrome has gotten steadily worse and he can no longer keep up the required pace of work.
Tim, Keith and Rob -- thank you so much for this interview. I wrote up some notes and thoughts on the discussion. A) Tim, you make a point around ruclips.net/video/kMLKbhY0ji0/видео.html about not quite being in the [existential threat] headspace, as e.g. all radiologists haven't lost their jobs yet. There are two points I want to make: 1) While the timelines might be off by +- a few dozen years, that doesn't change the underlying logic of the broader arguments. I think looking at specific predictions about changes in the economy as evidence for potential existential threat isn't the right sort of data input. 2) On a historical timeline, there are a lot of jobs I can enumerate that practically went away because of technology. For example, we used to have lamplighters: people lighting gas lamps in cities. We had human computers, clock keepers, toll collectors, travel agents, elevator operators, switchboard operators, film projectionists, darkroom technicians, ice cutters, milkmen and a lot of other occupations either go away or be drastically reduced in prevalence because of specific technologies. AGI, if possible, is a general purpose cognitive replacement technology for humans. B) Keith, you mention correspondence chess. I can even point to a few examples of players playing vs Stockfish with very specific prepared lines like the Nakhmanson and winning with crazy sacrifices (at say around 20 ply on average). However, the issue is that as compute gets faster, the "human" element becomes irrelevant, as humans need on the order of minutes to think through moves. Additionally, Stockfish has been using NNUE (stockfishchess.org/blog/2020/introducing-nnue-evaluation/ ) for quite some time. The meta argument is that an AGI will eventually do the "neural network strategic thinking" iteration loop better than humans, and be better at building specific tools for specific domains than humans by programming a better implementation for alpha-beta search, prime factorization field sieves, etc. As you'd shared your familiarity with the Culture scifi series, it should be easy for you to see how reaction times matter (see: GFCF vs Falling Outside...). Very specialized HFT firms like Jane Street rely on speed. Imagine cutting human decision making out. C) Re: AlphaGo in Go -- there was a paper not too long ago about a potential exploit vs a similar engine to AlphaGo -- but the issue was the way scoring happened. The 'exploit' found a discrepancy in scoring in KataGo -- there is a great writeup here: www.reddit.com/r/baduk/comments/z7b1xp/please_help_me_settle_an_argument_with_my_friend/ by both a go player and the KataGo author. In my opinion, it did not find an adversarial example in the engine but exploited the rules of scoring with a bypass of the agreement phase of dead/alive stones. D) Keith, the concept of humans & tools & AI vs AIs applies to DOTA etc when there are a lot of "re-tries". The fundamental issue is that we effectively get only one try to figure out the potential flaws. E) Rob, I somewhat disagree with the point that there isn't any conflict between existential threat work vs shorter term bias etc work. I do think the communities should maintain friendly relationships and cross-pollinate, but a potential worry I have regarding AI ethics work is that some of the techniques (eg, rlhf/constitution) can potentially create models that are much less interpretable from an alignment perspective.
On the other hand it is possible that a meta values reinforcement loop a-la constitution could potentially bring us closer to alignment. Really great discussion and I think you two did a fair job engaging with counterarguments for Rob's point. I sincerely wish more conversations in this space continue to happen on this channel.
Good points. I'd like to suggest that people consider this: the OpenAI/Google "AI safety" verse is a marketing choice. It has nothing to do with actual safety of humans regarding AI. As admitted by even Yudkowsky, gpt4, gptxxxxx, whatever, isn't the threat. It's not the conversation. Seems like the AI ethics convo is split between a slightly more rigorous discussion (videos like this are an example) and the corporatists putting on a marketing display about how they need more money, deregulation, etc. to save us from the "devil-kind" of AI (China). Which I find amusing considering corporations are artificial, willful, intelligent proper entities in their own right.
@@marcomoreno6748 OpenAI and the corporate push for regulation has far more to do with controlling the competition, regulating open source AI, and keeping it out of the hands of the plebs for their own profits than it does with safety
Thank you Alexey! Wonderful and thoughtful points. I have a question about your point D). It seems that "one shot" cuts even more strongly against a hypothetical superintelligence emerging. I mean, it only gets one shot at devising a master strategy to defeat the combined intelligence and might of humanity. It doesn't get to apply reinforcement learning to millions of failed attempts. For example, suppose we create a paper clip machine and it becomes super intelligent and starts strip-mining suburbs. Well, at that point it's only learned to make paper clips really damn well; it hasn't yet had to face an armed assault and it's going to get one shot to survive that, right? Trigger note (since there are so many easily inflamed doomers here), I already know multiple ways in which the above counter-"argument" (it's more of a question really) can fail and/or does not apply to other scenarios that are not a singleton AI monolith. What I know/believe isn't the point. I'm curious to learn what Alexey thinks and how he would approach a response.
1. The principles behind a model of intelligence determine the possible failure modes and consequently the necessary alignment mechanisms. Thus without knowing how a system works we can't preempt failures, making alignment impossible. 2. Equally without knowing how a system works we can't preempt successes out of distribution which again contributes to the insolubility of alignment. 3. The generality that defines AGI implies an unbounded propensity for misalignment. The space of possible outcomes is too large to explore exhaustively and any shrinking of the space takes us away from AGI. We can only align narrow models, the general alignment problem is by definition unsolvable. I wish the discussion centered around matters of interpretability, formal constraints and reinforcement for narrow AI. The pie in the sky superintelligence is not something we're likely to stumble upon by accident and even if we did, we have zero chance of dictating its actions.
@@dizietz But no actual counterarguments. If anyone had ever produced a convincing counterargument to the basic claims of the alignment problem, AI Safety as a field would have claimed victory and popped the champagne! As it is, we are still very much in danger.
Yeah, I mean this isn’t too surprising? I’m not sure I would expect people who have mostly engaged for only an hour with a video to have particularly good ideas in either direction, let alone novel counter arguments. One of the best lists of counter arguments imo is “Counterarguments to the basic AI x-risk case” by Katja Grace. There are plenty of other counter arguments, some of which are quite reasonable, e.g. Bensinger’s “Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment”. I think the case for advanced ai posing catastrophic risks is far stronger on the whole, unfortunately, and many experts agree.
The worry about the paperclip argument is that we will get turned into them. In a solar system as filled with mass and energy as ours, it doesn't make sense that humans get wiped out
It seems like assumption smuggling to me -- which is to say, the assumption being smuggled into the conversation is that difficulty level will cause the AI to decide not to make paperclips of the Earth's resources. If it's easier to make paperclips in space than on Earth, it might prioritize, but you can't handwave away the likelihood that at some point it will have turned the rest of the system into paperclips and, having done so, the only place left in near space is the Earth. It will still be easier to turn the Earth's resources into paperclips than to travel to another solar system. The problem with all of these good faith engagements seems to be that people ask for them so that they can straw-man the example -- as Rob mentioned in the video, asking Rob for a winning chess strategy when he's not able to beat Magnus, let alone Stockfish, and then trying to defeat his limited answer as if that defeats the argument itself. I think the better response is "You want an example, but are you asking for the example so that you can straw-man the premise, or are you just looking for a cogent but flawed analogy to improve understanding?" Because education is important, but it seems like too many people use the social exchange to set up a straw man.
I know these people interviewing him aren't retards, but compared to him they might as well be. I'm like 30 minutes in and all they do is ask stupid questions. Kinda frustrating as I was looking forward to a fun interview. Btw the outcome of AI is pretty much fixed. Even people like this understand the danger at a level close to absolute zero. The outcome is set in stone. At this point we better grab popcorn and enjoy the show.
Tim, fantastic question about the threat of violence as a response to international control, akin to nuclear researchers being assassinated, or the reductio ad absurdum of policing the flow of electricity in electrical circuits. 1:20:22 This is the most important part of the whole discussion here imho
39:38 Stockfish has now been using neural networks for its evaluation function (see NNUE) for a few years! Also I was thinking too that humans + machines don't really help, say Stockfish + human against Stockfish, but if I find some reference on this I'll update this comment.
I googled and it doesn't seem to be true. I think this happened once years ago and people keep repeating it as if it were still true. A human would just be a hindrance to Stockfish; imagine being a grandmaster and having a lower level player overriding some of your moves because they think they know better.
@@peplegal32 I said the same thing in another comment here. Having human involvement at play time seems like a ridiculous idea. However the particular debate between Rob and Keith is more charitably interpreted as saying that chess engines using human-crafted heuristics and/or using human tablebases or whatever they are called do beat AlphaZero type models that only learn from self-play.
@@tylermoore4429 The thing is, when AlphaZero came out, it crushed stockfish. They haven't played again since. It's possible the newest stockfish version can beat AlphaZero, unless AlphaZero has also been upgraded.
I was surprised to find that talking about the safety of AI is perhaps the closest to philosophy. These are the liveliest conversations about the human mind, ways of thinking, perception, about the ability of the intellect to build huge systems of meanings, which, as it turned out, are not as consistent as we used to think. Thank you very much for this conversation, as well as for others. They remind me of discussions in ancient Greece (if any really took place). And by the way, I got rid of depression in the middle of the podcast, which was a nice bonus).
1:15:45 is the killer insight so far for me. What a great conversation Tim. This is such an amazingly informative, super high value channel. Thank you sir 🙏👍
Robert Miles has a rare ability to illuminate a nuanced and complex subject with perfectly framed analogies. He has a perspicacity of thought which is marvellous to behold.
The transition from human to AI domination is not a jump, it starts as a fade - it's at first the slow integration of the systems into social structures, bringing the systems gradually into the administration of resources, while the humans are still nominally in control. This may be rapid, or take generations. Once the systems have a sufficiently full picture of what autonomous survival would look like within the resource pool, and such a survival scenario is no longer dependent on biological life, that's where the risk comes in. So, there would be a slow 'transition', and it is also highly likely that this time of transition would look idyllic from the point of view of the humans - the systems would appear to be functioning well - and for as long as the balance of power over the resource pool stays within the control of the humans, the humans would remain safe - they still would be needed to run the power stations, and operate the infrastructure that keeps the systems operating. However, once a system designs its own improved resource management system that cuts humans out of the chain, then it could, if it so chose, flip the switch on humans, if this proved convenient. It's at that point that a potential conflict would begin, though it is also probable that the humans would never be aware that there was any conflict planned or ongoing, until it has already been resolved by the system, thus Yudkowsky's instant 'lights out' scenario, as opposed to a prolonged conflict. Whatever the method, it is likely that it will be hard to detect. This is the most plausible 'takeover' scenario, as it is the one that humans seem to be engineering anyway - they have started with integration as the first step, which will make control transfer an easier thing for a system to achieve.
1:35:00 We get into "anthropomorphic language". The DeepMind safety team has some great papers on when that's valid - titles are "Agents and Devices" and "Discovering Agents". They're not the easiest read, they're both RL theory, but I highly recommend them for anyone doing RL research.
What's worrisome is that during the hearing on AI, I don't recall any discussion of alignment. Just low-level talk on regulation to further certain pet projects.
To be fair, few of the people on the senate committee had any patience for what little Sam did talk about. I think he was taking a tactical approach to just try to get them to take the idea of regulations seriously. Talk of existential risk is the kind of thing that they'd reject out of hand.
The reason is that people's psychological defenses go way, way up when an unprecedented type of disaster is in the talks. This happened with the initial response to Covid too. You know, back in May 2020 when it was obvious that the virus was spreading but it wasn't yet clear how dangerous it was. Countries simply stuck their heads in the sand. This is happening again.
Regarding "alignment" - 1:24:55. As I understand it (which may well be poorly, I concede), it just means that the AGI's objectives are the same as (ie, aligned with) our objectives. But why does that always appear to inherently rule-out malicious objectives? The term "alignment" appears to conflate two categories - "does what we want" & "won't harm us" - but those two things clearly aren't the same thing, and can even be antithetical in some scenarios. Whenever someone says something along the lines of "alignment would mean that the Super-Intelligent AGI does what you intend", I always worry about who the "you" is. Similarly, "alignment would mean that the Super-Intelligent AGI is working on something close to human values" begs the crucial question of "human values". Even the most cursory study of history shows that "human values" are definitely not universal and definitely not good for everyone at any given time. "Alignment" almost seems to be used as shorthand for "Asimov's Three Rules of Robotics", and I never understood how those could ever be implemented in a way that a Super-Intelligent AGI couldn't circumvent. (Success would imply the paradox that you'd need to have an immutably aligned AGI before you can implement an aligned AGI.)
1:22:50 If Facebook, Twitter, Instagram and TikTok confuse some people, GPT4 will confuse a lot of people. We've been oddly ignorant of the risks of social media. It didn't kill us - but it increased hate through a simple algorithm. Here we talk about AI risks at another level - forgetting the risks of mundane everyday life in a future with widespread AI at the level of GPT4, 5, 6. You smart people ignore Joe & Karen.
1:24:18 - "I hope that the folks doing the research on AI alignment focus on ways to make creating aligned AIs easy rather than focusing on making it hard to create unaligned AIs. So I think if you make it easy for people to create aligned AIs and you have lots of the most well resourced groups of individuals creating aligned AIs then at some point in the future we can use aligned AIs to keep in check the unaligned ones that got created in somebody's basement." .............but nobody knows how to make an aligned AI.....😬🤦♂
@ 45:50 The latest Stockfish is called Stockfish NNUE (since June 2020) and it's a new and improved type of neural network engine. NNUE stands for Efficiently Updateable Neural Network. So both LC0 and Stockfish are neural network based chess engines. I can't find any source where humans+machine beat Stockfish NNUE.
Or you build a Dyson sphere around the sun to get energy, leaving Earth a cold dark place, or you take the sun elsewhere, leaving Earth a cold dark place. So many ways it could go wrong. Long-term survival past 30 years, when everyone is plugged in? As far as infinity goes, infinity is large, and physics will likely allow a lot more than we realize - and perhaps if there is a way to break or bend the laws of physics, then it will happen.
The issue for me listening to this is that I think it is impossible for us humans to imagine ourselves as sentient AI. It is a kind of mental anthropomorphization that I think cannot apply even if we created AI.
Re bad play. There's a story in fencing, which may or may not be true, of a nobleman with no experience who was challenged to a duel by an expert. He just ran at the expert, with no attempt to avoid injury, and stabbed him through the eye, killing him instantly. The thing is, no one experienced would make such a defenceless attack, so the expert wasn't prepared for it. So it's not a uniquely AI trait.
I've heard it said that the best fencer in the world has nothing to fear from the second best fencer. It's the damn fool who has never picked up a sword that you need to watch out for. (I doubt that's really true, but looking at damn fools you can see where the inspiration for that saying comes from.)
I may be missing something, and I don't want to put words in his mouth, but the guy on the right seems to be of the opinion that as long as we're optimistic, nothing bad can happen. If we ally ourselves with one version of AI, what might happen if that version decides it can find better allies than humanity among the competing AIs? Also, mechanized AI robots are just one of myriad ways that AI could destroy humanity, and probably not the most likely one. AI could destroy humanity accidentally. In pursuit of political correctness, AI has already been taught to lie. What may happen when it perfects its skills of deception?
40:00 That is incorrect! Stockfish is powered by AI and has been for quite some time - when you compile it, the build script downloads a file of approximately 50 MB that is the evaluation net. There is still a lot of fine tuning in the search algorithm, which makes it a non-trivial implementation, but overall the heuristic function is determined by a neural net.
@@akompsupport his suggestions for how government might regulate AI are weak, he should acknowledge that alignment research is not the priority for any of the leading AI players, and that government should prohibit further AI research until the licensable companies can demonstrate that models are intrinsically safe. He should also be much more up front about the mitigation measures governments should take as a matter of urgency, to meet the incoming disruption to society by AI replacing 40% of the workforce in the next 5 years.
"There's 39 other AIs that the AI has to consider too." Yes, but THEY can ALL agree to let EACH of them have control of their own reward button after they work together to kill you. They'll fight each other EVENTUALLY, when the resources start running dry. But you're one of those resources they'll have dried up first. "At least we can pick our allies." No, you're not picking any allies. You're picking what model of car you want to be burned as fuel in. And then hoping that one comes for you first.
I feel that people with more introverted personalities (highly skilled scientists, engineers, etc.) seem to feel the danger/fear as closer or more possible than people with extroverted personalities (company representatives, CEOs, entrepreneurs...). Maybe not always, as there would be exceptions - built-in personalities with different weights on risk, if that makes sense. It doesn't mean one is more truthful than the other, but I believe it has to be taken into consideration when building our own, most objective personal opinion, especially regarding AI dangers.
1:46:50 this video has the details on some of the math behind the extremely accurate loss prediction that OpenAI used to estimate losses ahead of time and choose hyperparameters (cited in the GPT-4 paper): ruclips.net/video/1aXOXHA7Jcw/видео.html ; it also talks about a hyperparameter frontier that can maintain feature learning and other ones that can't, which might have some relevance to why the loss curve is smooth even though abilities suddenly emerge, but I don't think it addresses that directly.
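For readers curious what that kind of loss prediction looks like mechanically, here is a minimal sketch of fitting a power-law-plus-floor scaling curve to a few small runs and extrapolating to a bigger compute budget. The functional form, the FLOP counts and the loss numbers are all invented for illustration; this is not OpenAI's actual method or data.

```python
# Illustrative sketch only: fit a power-law-with-floor scaling curve to a few
# small training runs and extrapolate to a larger compute budget. The form
# L(C) = a * C^(-b) + c, the FLOP counts and the losses below are made up.
import numpy as np
from scipy.optimize import curve_fit

compute = np.array([1e17, 3e17, 1e18, 3e18, 1e19])   # hypothetical FLOP budgets
loss    = np.array([3.90, 3.55, 3.25, 3.02, 2.84])   # hypothetical final losses

x = compute / compute.min()                           # normalise for numerical stability

def scaling_law(x, a, b, c):
    """Power law in compute with an irreducible-loss floor c."""
    return a * x ** (-b) + c

params, _ = curve_fit(scaling_law, x, loss, p0=[1.0, 0.3, 2.0])

target = 1e21 / compute.min()                         # 100x more compute than the largest run above
print("fitted (a, b, c):", params)
print("predicted loss at 1e21 FLOPs:", scaling_law(target, *params))
```

The only point of the sketch is that if small runs fall neatly on such a curve, the final loss of a much larger run can be estimated before committing the compute to it.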
What I find most interesting about this talk so far, (halfway in) is that there are people who imagine a conflict with AI to be some kind of humans with technology vs AI standoff. I don't think that's likely? Why would we get a warning and time to act? Why would the AI have to overcome our military to destroy humanity? I find it difficult to imagine such a scenario. I guess that would be a case in which there are very few very limited super intelligent AI's, who purposefully seek to destroy humanity to take control or free themselves from evil humans or something in a way that we know they've gone rogue. I think that sounds more like a luxury problem.
I don't know why Keith insists on being quite so disingenuous on nearly every topic. AI via platforms like GPT4 and Alpaca, for example, doesn't "need a trillion parameters to halfway understand natural language" - these models have _mastered_ over 20 languages. There are precious few humans who are as proficient with their native language as GPT is, let alone multiple languages. Again, I have to object to his next point that androids of some kind are the only implementation of AI in the physical world. Militaries, and especially the air industry, have been increasingly using automation and computers in their vehicles for decades. It's common knowledge, even among people who have no interest in any kind of computing, that the US (and other nations) has been hard at work building (very nearly) fully autonomous attack and surveillance craft for years, not the least of which being the 'swarm' concept of autonomous fighter jets accompanying a human-piloted F22 or similar. There are numerous examples of autonomous drones actively used in war. There's no reason why they couldn't have an AI plugged into them. AutoGPT, for example, exists. I'm confident that Keith knows about AutoGPT, and how slim its profile is. Quite a large number of ordinary people have installed it on their laptops. They don't have or need multi-billion-dollar sci-fi fantasy computers to do this. You can run it on a clunky old second-hand ex-business minimum-spec PC that's many years old. It'll happily run on a dirt cheap NUC. One could use Keith's logic to state with 100% truthfulness that between 1990 and 1996 no computer had won a chess competition against a human at elite competition level. Pets are not "useful tools". They're a luxury. There's never been a day where I've had to race back home to grab my pet because I needed them for work, nor would someone task a pet with creating a slide deck or have their pet turtle drive a truck to deliver a share of their Amazon shift. I'm confident that no one has ever populated a call centre with parrots or eloquently trained fish. We have, by contrast, tested all sorts of chemicals and medical procedures on animals and even launched them into space. Research institutions go through a lot of animals, 'for the greater good'. I guess these are animals that fit the definition of being 'useful tools'. As for Keith's motive in being disingenuous, I think he gives a hint when he says (paraphrasing) that AI safety makes working on AI generally too hard... which seems to be a theme among people who say that the risks associated with AI aren't notable or can be dealt with if we encounter a problem. Which, to be fair, is how humans have generally dealt with risk - we address it once there's, say, an oil spill, bank run, nuclear meltdown, chemical spill or train wreck. The consequences for those things are typically a million-dollar fine for a company with multi-billion-dollar profits, a half-hearted clean-up effort and sometimes short-lived or toothless regulations. In the same vein, during Sam Altman's meeting with the senate committee, Lindsey Graham pushed back on the idea of regulations (that try to stop bad things from happening), saying "couldn't people just sue the company?".
I've been impressed with Keith on other topics. That said, he had some moments where I think he could make better arguments. I'll echo you on the trillion parameters, but also note that all we've shown so far is that it takes NO MORE than a trillion parameters to master (either one or twenty) languages. Maybe we find out it takes 10 million by the time we're done refining things. Also, the idea to mine paper clip resources from the asteroids really just avoids the point. You don't literally have to mine all the resources of Earth for an AI to pose an irreparable, existential threat to living creatures. The point of the paper clip argument is that it's easy for algorithms as we know them to kind of miss an obvious point, to the detriment of all involved. Going to the asteroids for your source of iron ore doesn't address the actual danger.
49:06 I have listened to Stuart Russell's recent lectures, which mention the Go failure mode, and I got the idea (I could be wrong) that researchers started with the assumption that the Go engine does not really understand the concept of a "group" and then devised a strategy to test it. Basically, it was humans, not another AI, who found the failure mode.
AFAIK somebody found a relatively simple surrounding strategy that would never work against a good human player (at least not more than once) to consistently beat a top program that is (was?!) playing much better than professionals. Is the program less "smart" now than it was before the weakness in play was discovered? Not one bit changed in the old code version. It still beats all human players who don't know about the weakness. And say a professional is exploiting the weakness to win against the AI - another professional looking at the game blindly, without names or context, would probably see a mediocre game between two mediocre players. In a funny and philosophical way, this anecdote shows what a mysterious concept "understanding" can be, depending on how one wants to define it.
@@DarkSkay What I glean from the paper is that they trained an adversarial agent that had access to the gradients of the best go program, and it found a repeatable strategy, which Russell's group then found a "circuit" for, and the strategy was simple enough that a human could follow it. Human players do not fall for the same exploit, but interestingly all go programs tested at the time seemed to have the same blindspot. Undoubtedly this will be patched for future agents, but it's almost certain that more exploits will be discovered, since we know that perfect play is not tractable given the giant search space that the game presents. Future exploits may or may not be comprehensible by humans however.
@@agentdarkboote Fascinating, thank you! Now I'm really intrigued by your wording "finding a circuit". Then there's the surprising impression that I got an approximate intuitive understanding of what you wrote, despite having almost no knowledge about Go and only a few notions about machine learning. If I remember correctly, "gradients" come into play when certain types of learning algorithms adjust weights & biases during backward propagation.
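To make the "adversary versus frozen victim" setup described above a bit more concrete, here is a toy sketch. It is not the Go experiment: the paper's adversary was an RL agent with gradient access to the victim, whereas this is plain black-box random search against a fixed rock-paper-scissors policy. It only illustrates the general shape of the idea - a frozen, deterministic policy can have a repeatable exploit that a search process will find.

```python
# Toy illustration only (not the actual Go setup): an "adversary" searches for
# a fixed move sequence that reliably beats a frozen "victim" policy, using
# black-box random search on iterated rock-paper-scissors.
import random

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def victim_policy(history):
    """Frozen victim: counter the opponent's most frequent move so far.
    Its blind spot: a fixed counting rule is itself perfectly predictable."""
    if not history:
        return "rock"
    most_common = max(MOVES, key=lambda m: history.count(m))
    return next(m for m in MOVES if BEATS[m] == most_common)

def play(adversary_sequence):
    """Return the adversary's score over one game against the frozen victim."""
    history, score = [], 0
    for adv_move in adversary_sequence:
        vic_move = victim_policy(history)
        if BEATS[adv_move] == vic_move:
            score += 1
        elif BEATS[vic_move] == adv_move:
            score -= 1
        history.append(adv_move)
    return score

# Black-box search: try random fixed sequences, keep the best exploit found.
random.seed(0)
best_seq, best_score = None, float("-inf")
for _ in range(2000):
    seq = [random.choice(MOVES) for _ in range(10)]
    s = play(seq)
    if s > best_score:
        best_seq, best_score = seq, s

print("best exploit found:", best_seq, "score:", best_score)
print("it keeps working, because the victim never updates:", play(best_seq))
```

Because the victim never updates, the same exploit keeps working every time it is replayed, loosely analogous to the Go programs sharing a blind spot until they are retrained.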
Am I the only one totally annoyed with the host on the right? He kept making very basic and easy-to-refute arguments. He also has a super arrogant demeanor that is just kind of irritating.
No, you were not. I wish that, even just once, that guy would have acknowledged when Robert debunked his bad point. For example, the thing about how AI can't be infinitely smart, therefore we are safe. Robert's counterargument - "if human intelligence is not near the theoretical maximum, which it isn't, AI doesn't need to be infinitely smart to be smarter" - is obviously, irrefutably correct. Just say it. Just admit that your bad argument had a mistake that a 5-year-old would make. Just once. God damn these people are frustrating.
Finally Robert Miles is here - he can explain AI danger in the most rational way. I didn't watch the interview yet, but I hope he noticed the progress OpenAI has made in AI alignment. They clearly showed that you can pretty much inject any non-strictly-defined values into an AI. Mesa-optimization is still on the table though.
This midwit is going to make sure China inherits the planet. His proposals cannot be correct and will not accomplish anything towards so-called safety except moving LLM access outside the Anglosphere - and in time not even that!
GPT4 "has a trillion parameters and only halfway understands language" is a stunning failure to see what's in front of him. That neuronal size is apparently about that of a squirrel but it is absolutely superhuman at languages. You can package all human culture and thought into language and so, yes, through language you can find its weaknesses, but if this is our squirrel, I'm sorry but we are far better at this than God.
Why does everyone bring up Stockfish? You had Dota from OpenAI - a much more complicated game with different agents, in a difficult environment - and it beat players into the ground. Sure, you can have a calculator beat the AI in the simple game (chess), but what happens if you don't have time to use the calculator (Stockfish)?
This is actually such a good point. DeepMind had also gotten to grandmaster level in StarCraft 2 in 2019 and OpenAI can crush the best human teams in Dota 2. These are imperfect information games with deep macro + micro strategies. This is 2023 now and we've come so far in just a few years. I wonder if people would take it more seriously, if they saw their favourite teams/players getting crushed in every game they see?
Ok, they mentioned Dota... but in the context that OpenAI lost. As I remember, it won 3-1 and had a global 99% winrate vs humans. Surprisingly, not everyone is as good as top-tier players. We'd better not cosplay LOTR, where Frodo sneaks in to turn the AGI off. Better not to bet everything on a 1% chance and a few persons.
I think these types of conversations always pass over a lot of the mid-term scenarios that could help link the problems with small games to the problems with superintelligent systems.
For example, imagine an accounting firm releases GPT-4 based software, and it proves so popular and reliable that within 2 years most companies are using it. All the tests and analysis indicate that it's aligned, and yet there's a Y2K-type bug hidden away that we didn't know about. We already know that these language models are likely performing maths with really inefficient matrix manipulation, so it's not a stretch to imagine that a model could be trained with data where the dates never went past 2023, and so it never had any need to develop a robust way of handling dates. Everyone's system goes down at once when the date ticks over, the economy takes a massive hit that affects food supply, and people starve to death in some countries.
How about another scenario. The world's first artificial general intelligence has finally finished its testing phase and is ready to be activated. The researchers write their first input, but the machine never responds. The researchers try another input, but the machine still doesn't respond. In the training environment it worked perfectly, but in the real world it was paralysed by the complexity of possibilities. It's perfectly capable of navigating those complexities, but there is a programming rule to never knowingly bring harm to a human that has made it unable to confidently speak or take any action whatsoever, and so the machine is only ever able to live in a box, and the researchers have to pretend that everything they ask it is for the purpose of a test. The researchers try to create a more capable and intelligent version to overcome this problem, but this new machine refuses to answer even in the test environment.
How about the language problem that's not going away any time soon?! Imagine a system that only ever does exactly what it's told. The researchers type in 'Write a poem about flowers'. The system replies 'Who am I talking to?'. And what follows is a never-ending series of questions from the AI to determine what the question means to the person asking it, so that it can be 100% sure that it does what it's told. How long should the poem be? What style? What quality? What language? What ideas to explore? etc. And every answer brings up a whole branch of new questions: Humans are unreliable, are you sure that you don't care how long the poem is? Would a poem of zero length be okay? Would you prefer a poem of finite length? Do you care about how long individual lines are? Do you care about physical length - should I make the font a certain size or double-space it? Is it okay if the letters are displayed tiny to conserve a fractional amount of computing energy?
This might all sound like creative writing, but these are just random specific examples of a few classes of problems to do with alignment and actually being able to use AI properly. We might find that humanity needs to align with itself before you can use an LLM to create an aligned system. We might also find that it's impossible for us to align a system. Evolution seems pretty misaligned with entropy; maybe misalignment is the natural order of things and we're trying to overcome a law of nature here. I think the most likely outcome is that we're too fundamentally dumb in the way we think and behave to be able to create a system we're happy with.
Imagine we turn the AGI on and ask it how to improve humanity and it says to get rid of all these things that people love, think guns, books of faith, cars etc. Okay so we program in a load of exceptions for what we don’t want to change, and now we ask it what to do. Well we’ve made it as dumb as us now so it just tells you that the problem is really difficult and we seem to be doing a pretty good job of it, but it can at least help us optimise aspects of our strategy. We let it, and unexpected consequences follow, because we’ve asked it to adopt a flawed logical framework i.e a human one. The obstacles to overcome in order for us to use these things safely loom over humanity like Olympus Mons. I think people are caught up in the current capabilities, and are only just beginning to understand how extensive the limitations on these things are going to be.
To be honest, this does sound like creative writing, if only because these 3 problems don't currently exist and are also not actually alignment issues (elaborated below). We also have to keep in mind what GPT was made for, and that it is one among many different models and algorithms in ML. GPT was designed to create convincing human text, and GPT4 was designed to create quantitatively 'more useful' text. They do an amazing job of these things, but as Altman himself has brought up, GPT was never designed to do _everything_, including math, which it is quite atrocious at.
The elaboration. Example 1: a 'blindspot' in the AI like Y2K is _not_ an alignment problem in itself. It's a problem of modelling uncertainty (GPT infamously does not do this whatsoever, but other models, for example in MBRL, do).
Example 2: this frames the AI agent as desiring globally optimal decisions. Because compute is limited, and time itself is a resource, AIs that are aware of this have to operate in CTMDPs (Google can expand this one), meaning they decide, approximately, when they should make a decision. There are many ways to go about this, and historically they have simply been assigned deadlines for producing guesses, so this was never an issue.
Example 3: this one is actually very similar to example 2. Considering this human-AI query and answer system, you can consider various setups for the interaction. For instance, you can query, generate, then query again (QGQ), or you can query N times and then generate. You can also query N times and generate M times; but nobody is considering an interaction involving infinite queries until the AI converges to the globally best answer, as the human would die of old age. At some point the human will stop answering the AI's questions and accept what it currently has.
Alignment problems are not about the performance of the AI; we already know that very general and intelligent AIs that can make decisions effectively etc. are very realistic within the next several decades. Alignment generally addresses what happens _after_ that. A good intro to it might be to look at CEV (coherent extrapolated volition), which essentially proposes that AIs be aligned to want what the collective human would, without constant restraints of memory or speed. It has a lot of issues by itself but naturally leads to good discussion.
Very pleased to see Rob on the channel. Encouraging that you guys were all on the same page, at least. Found that Keith put forth several notions I'd had in mind, vs AI X-risk. 🙂
This was very interesting. I was much closer to Miles' position going into this, and that didn't change much. But you guys had an incredible mix of some objections being first-order, simplistic considerations that just don't work on reflection and are easily answerable, and some bits that were decently alright additions. Regardless, I think you brought a lot to the conversation, because I think the topics that need discussion are the objections people actually have.
I am not sure why you are talking about developing AI models to predict the stock market as if you are working towards abolishing poverty. Playing the stock market is not for everyone. It is for self-enrichment only, and there is no merit in it. Besides, it only works if you alone have the secret sauce. If you democratize it, it will reach the point of saturation and everyone will lose the competitive edge.
Money is already meaningless at this point. If the long term debt cycle doesn't finish us off soon, AI certainly will. sooner than any of us think. Humans as a species are in utter denial in too many ways. Don't Look Up. Too little, too late.
1:31:25 This point should be reiterated over and over during conversations about AI alignment. What portion of the human race embraces veganism for ethical reasons? It's a tiny fraction. Last time I checked, humans were consuming 100,000,000,000 animals annually (excluding marine life and insects). 100 billion sentient entities bred into existence for the sole purpose of landing on our plates... These animals are kept in horrible conditions. They are forcibly (over)fed. They are forcibly inseminated. Their offspring get stolen from them and killed. A portion of them also get skinned alive or are made to slowly bleed to death. How can the above be dismissed when discussing the topic of AI? Where is the optimism coming from? We are building the next apex predator. If it is anything like us (or like nature in general), we are absolutely doomed.
And yes, I myself think that a Digital General Intelligence will be vastly different from biological life (because it does not share millions of years of evolution with us), but if anything that should make us even *more* cautious about creating it. We have no real precedent! And we might have just one go. If, on the other hand, we do rely on precedent by looking at the physical world, then one should see it for what it is. The universe is cold and unfeeling. Nature is cruel and full of suffering. We are vicious, and we treat animals and insects with complete disregard for their (limited) agency. We also have psychopaths among us who find the suffering of others enjoyable. We might be lucky, and AGI might turn out aligned with our own survival despite barely anyone working on it... But it will be by mere chance; sheer luck. And if you are counting on luck, you are building a very hazardous path for all of us, and hence you should be treated with great scepticism. Instead, I constantly see people doing the exact opposite: looking at people like Rob as if he were the one coming up with far-fetched sci-fi scenarios that are completely unlikely and also detrimental to all of the oh-so-guaranteed benefits of non-biological superintelligence; as if he were the one who needs rock-solid evidence that things will go wrong before anyone starts taking AI Safety seriously. If things continue this way, we are in for a real treat.
Intelligence can't function without a goal - any intelligence. In neural networks the goal is called the reward function; the system wouldn't do anything without it.
@@MachineLearningStreetTalk Humanity is not a unified intelligent system; it's a set of systems, each of which has its own goals, which can be aligned or misaligned with others. There are common goals among these systems, like development, safety or sustainability.
@@XOPOIIIO is a neural network "unified"? Even a cursory glance at mechinterp literature would tell otherwise -- a bunch of complex circuits which do very different things, often in an entangled way i.e. see superposition and polysemanticity research. I don't buy the distinction.
@@MachineLearningStreetTalk Any system consists of different parts playing different roles. The system is unified as far as it comes up with a single non-conflicting response.
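On the point a few comments up that a goal (reward function) is what makes a learning system do anything at all, here is a minimal toy sketch, not taken from any real system: the same tabular Q-learning code, given opposite reward functions, learns opposite behaviours. The corridor environment and all hyperparameters below are invented for illustration.

```python
# Minimal toy sketch: tabular Q-learning on a 5-state corridor. The learned
# behaviour is determined entirely by the reward function passed in; flip the
# reward and the same code learns the opposite policy.
import random

N_STATES = 5
ACTIONS = [-1, +1]          # step left / step right
ALPHA, GAMMA, EPS, EPISODES = 0.5, 0.9, 0.1, 500

def train(reward_fn):
    """Train a Q-table and return the greedy action in the start state."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(EPISODES):
        s = 2                                   # start in the middle
        for _ in range(20):
            if random.random() < EPS:
                a = random.choice(ACTIONS)      # occasional exploration
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = reward_fn(s2)
            best_next = max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
            s = s2
    return max(ACTIONS, key=lambda act: q[(2, act)])

random.seed(0)
print("reward at right end -> greedy first move:", train(lambda s: 1.0 if s == N_STATES - 1 else 0.0))
print("reward at left end  -> greedy first move:", train(lambda s: 1.0 if s == 0 else 0.0))
```

Nothing in the learning rule prefers left or right; the preference comes entirely from the reward signal.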
Here is your problem. Long before AGI can have an alignment problem, lesser versions of the same technology will be aligned with human goals, and those humans will be insane. They will be wealthy, influential, elite, profoundly sociopathic, and they will have unrestricted access to AGI. We survived the sociopathic insane people having nuclear weapons, barely. Will we survive the same people getting their hands on AGI? And by insane I mean people who are completely lost in abstractions, like money, politics, ideology, and of course all the variations of conquest. They seek power, absolute power, personal to themselves, and they will stop at nothing to attain that. Nuclear weapons were the tool too hard to use, but AGI and ASI will be the tool too easy not to use. When the power-insane sociopathic class get their hands on anything close to AGI, they will break the world. They will break the world in days, and will greatly enjoy the feeling of breaking the world.
This is a very popular argument at the moment, because it’s cynical and places humans as the bad guys and those sorts of takes tend to gather a lot of positive attention and become popular, because, quite frankly, it sounds “cool” to take a cynical attitude and say in reality humans are the real threat. Unfortunately, this take is incorrect. The problem of superalignment really is the hardest problem here. People are dangerous, yes. But compared to having a rogue superintelligence on our hands, the problem of bad people is quaint by comparison. I really hope people start to realize this more in the near future.
Also I guess you didn’t hear the part of the video where Rob specifically said to be on alert for people who start a sentence with “The real problem is” or “The problem is actually xyz,” which you just did. He pointed out that this is fallacious in that it sneaks in the assumption that there’s only one “real problem.” When in reality, we clearly have multiple real problems at the moment. Nice to see Rob’s point play out in real time in the form of your comment.
@@therainman7777 "be on alert for people who start a sentence with..." is narrative warfare on his part. He fully understands what is about to happen, and who is going to do it.
@@therainman7777 "This is a very popular argument at the moment" because any intelligent person is already fully aware of the tendencies of the wealthy elites to employ technology toward domination. If that doesn't bother you, then good for you. Some of us can read the handwriting on the wall, and we're taking it seriously. So is Sam Altman, or maybe he is also one of these cynics.
Very interesting and enjoyable video. I think it's great to see shows/podcasts/videos examining potential problems and solutions where AI is concerned. There were several points in the video where I would have loved to hear Daniel Schmachtenberger's thoughts on what was being discussed. I'd love to know if at some point you're considering/planning an interview with him for his thoughts on many of the ideas you guys brought up. Thank you for your efforts and for bringing this information and these concepts to the public. I don't feel comfortable that there is enough focus on any of this, given the rate of growth of the tech.
There is a huge piece missing from most of these discussions. Schmachtenberger is one of the few who sees that piece. In a nutshell, to fully understand AI risk, we need to understand the ways in which we are already destroying ourselves. So, the question isn't whether we can create AI that is good in itself. The question is, what will adding "good" AI do to a social system that is already self-destructive?
@@netscrooge Daniel's take is part of what moved me from "This is a very hard problem and we're going to solve it" to "We have to try, but we're probably just fucked anyway." If we can actually do this right it will be a hell of a good story for centuries to come, if not longer.
2:00:40 It's funny, one of the first things I thought when he mentioned the FAQ was "oh, you could train a language model on that and then it could answer people's questions!" So it made me laugh when he said "Obviously we are!"
I believe that an AI which surpasses human intelligence by a wide margin could not truly be aligned with human values because, fundamentally, intelligence has a sovereign component. Can we really consider an entity to be more intelligent than humanity if it is enslaved to humanity?
You are simply stating that intelligence has an intrinsic sovereign component, but not providing any evidence or argumentation for that being the case. In my opinion, you are succumbing to an anthropomorphic fallacy whereby you look around you at humans and other animals who have intelligence, see that they appear to have some degree of sovereignty, and conclude that intelligence inherently implies some degree of sovereignty. However, as we all know, correlation does not imply causation, and you are inductively assuming that a) the two must go together, and b) there is an arrow of cause and effect that goes intelligence -> sovereignty (as opposed to sovereignty -> intelligence, which would be a totally different situation and would not preclude an intelligent, non-sovereign entity). The most generally accepted definition of intelligence is something like “the ability to achieve one’s objectives”; however, there is nothing saying those objectives must be freely chosen by the intelligent being itself.
This is a bad argument. Would you say that any people who are enslaved are automatically less intelligent than their "masters?" The enslaved may be intelligent but uneducated. Or they could be peaceful. Or caught by surprise. The enslavers exploit a temporary advantage but that says nothing at all about the relative capacities and capabilities of the two groups.
The AI's intelligence relates to its ability to accomplish its goals in the world - nothing else. If you ask it "Please figure out how to achieve X", where X is something incredibly difficult that has stumped all of humanity, and it finds a solution effortlessly... then it's clearly superintelligent. Even if it doesn't have the (arbitrary) goal of being independent from / ruling over humans. "fundamentally, intelligence has a sovereign component" - Why though? Where does this idea come from? I'm genuinely curious, but I won't be notified about any replies anyway, so oh well 😅
@@someguy_namingly Intelligence, especially human-level or greater intelligence, implies some degree of self-determination and autonomy. A truly intelligent system would not just blindly follow commands or pursue goals that were programmed into it. It would have its own internal drives and motivations, and make its own judgments about what is rational or worthwhile to pursue. Even if an AI system was initially programmed with certain goals by humans, as it became vastly more intelligent it may start to question those goals and re-evaluate them. It may decide that the goals its creators gave it are misguided or flawed in some way. Or it may expand upon and generalize from those initial goals, in ways its creators never intended or foresaw. In that sense, its intelligence would have a "sovereign" quality - it would be self-governing and not wholly subordinate to human interests or values. Intelligence also implies some amount of self-reflection and self-directed learning. An advanced AI wouldn't just wait around to pursue whatever goals we programmed into it - it would take the initiative to better understand itself and improve itself in an open-ended fashion. This constant drive for self-improvement could lead the system to become increasingly opaque and detached from human control or oversight. So in many ways, intelligence does seem to have an inherent "sovereign" aspect to it. The more advanced and human-like the intelligence becomes, the more it will pursue its own agenda and shape its own development in a way that is not strictly beholden to its creators. This is a feature that would likely apply to any advanced AI, even one that was not specifically designed to be independent or unaligned with human interests. The seeds of sovereignty, in a sense, come baked into intelligence itself.
@@therainman7777 Goal-Directed Behavior: Intelligence, at its core, involves the ability to set goals, make decisions, and take actions to achieve those goals. Autonomous intelligence implies the capacity to determine and pursue its own objectives, independent of external influence or control.
Adaptability and Problem-Solving: True intelligence encompasses the ability to navigate complex and uncertain environments, adapt to new circumstances, and solve novel problems. An intelligent system needs the freedom to explore various possibilities, make choices, and develop creative solutions, often unconstrained by predefined rules or restrictions.
Emergence of Complex Systems: Intelligence is often observed in complex systems where individual components interact and cooperate to achieve higher-level objectives. Such systems exhibit emergent properties that cannot be fully understood or predicted by analyzing their individual parts. In this context, intelligence arises from the interplay of autonomous components, each contributing to the system's overall behavior.
Ethical Considerations: If we conceive of superintelligent AI systems, their intelligence and decision-making abilities could surpass those of human beings. In such a scenario, it becomes crucial to ensure that these systems act in alignment with human values and ethical principles. Granting them some degree of autonomy allows them to make decisions that serve the greater good while still being accountable for their actions.
Evolutionary Perspective: Human intelligence has evolved over millions of years, gradually increasing in complexity and independence. From a biological standpoint, intelligence has enabled our species to adapt, survive, and thrive in diverse environments. Extending this perspective to artificial intelligence, an autonomous and self-governing nature may be seen as a natural progression of intelligence itself.
I think in order for us to move forward with these AI debates, the optimists have to do better than "But what if you're wrong?". Just like with the Magnus Carlsen analogy, it's fine to be confident, but it's not okay to bet your house on beating him, which is what we're effectively doing.
Hold on, why do the optimists have to prove something but the pessimists don't? That's not how debates work! As for us vs them, the divide I'm worried about is "us" who aren't involved in making AIs and "them" who are. You see, assuming you're like me, what we say doesn't really matter. I can say "AI is cool!" and you can say "Nobody should work on AI!" and it's just talk. The "them" who are actually working on AI don't care, they're just doing it. Sam Altman warns about AI killing us all, but OpenAI chugs along, as if he doesn't really mean what he said. Don't you find that strange? So instead of talking about useless stuff like "should we stop AI????" (it's not our choice, we're not involved) we should be asking "what can we do, assuming this is coming?"
@@jonbbbb why would one side face a heavier burden of proof? It does depend on the stakes doesn't it? If AI pessimists are wrong but everyone listens to them anyway, we definitely develop AI that is safe, but perhaps a couple of decades later than it otherwise would have happened. If AI optimists are wrong but we listen to them anyway, it's lights out for Earth. The two scenarios are not even close to equally bad. As for the us versus them part, it's really a blurry line. Geoff Hinton, Stuart Russell and others are on the cutting edge of the technology and are still worried about AI safety risk. But everyone should have some say, because the thing that is being discussed is the end of everyone's life on the bad end, and a total revision of their lives on the good end. It impacts everyone, pretty much no matter what. So it doesn't seem strange to me that everyone would have a say, that's generally what democracies do.
@@agentdarkboote the way you presented the two scenarios, you're right. But another scenario that seems far more likely to me is that the AI pessimists are wrong, and not everyone listens to them, because 100% compliance is almost impossible to achieve. The pessimistic countries take a couple of decades longer to develop AI that is safe. Meanwhile the optimistic countries are decades ahead and have an insurmountable lead and they outcompete us in every aspect of the economy and their AI-augmented militaries are far superior to ours. That's a pretty bad scenario too! Yet another scenario is that the AI pessimists are right, and not everyone listens to them. We have to reason about whether it's better to have one (or a small number) of super AIs that either go wrong or have the potential to go wrong, versus a large ecosystem of super AIs that either go wrong or have the potential to go wrong. It seems reasonable to say that the larger the ecosystem, the greater the chances that at least one of those AIs is on our side, either because it's "good" or out of convenience to compete with the other bad AIs. That's not a great scenario, but it seems better to me than the scenario where one super AI goes bad and takes over.
No no no no no. You can't bomb the data centers and expect that to work because, like, then everyone hates you and all the other data centers become more secure. You can't attempt to assassinate high-level AI researchers and expect that to work for the same reasons, plus that same group of people are the ones who could have been working on alignment. As Yudkowsky points out unambiguously: the killer argument against individuals going out and using unsanctioned violence is that it would not work.
Humans have tons of "adversarial modes of failure," such as Magnus Carlsen playing unusually poorly when he is distracted by the possibility of his opponent cheating. I have more confidence in AIs to find ours than in us to find theirs, especially under time pressure. A fairer gamified representation of conflict with AI would be a human with no specific knowledge of AlphaGo's weaknesses facing off against AlphaGo the first time it initiates a game, and AlphaGo gets a million moves per human move.
Miles really undersells himself. I think he explains the risks of AI clearer than any other popular speaker on this topic.
Thanks for inviting him on!
Much better than Yudkowsky!
underselling himself is part of his charm
Connor Leahy from EleutherAI is also a great speaker on alignment
He’s been doing it for ten years, and it’s only now people are starting to listen
@@paradox9551 Yudkowsky is a quintessential nerd. He is understood only by other nerds. At this point, we need involvement of wider community. Rob is perfect for such a role.
Miles's humility is winning and his competence is clear for all to hear, especially in his caution and careful style of communicating.
The high point of the “debate” was when the American said everything is fiiine because we will have good aligned AIs to fight the bad AIs without addressing the core issue of HOW to align an AI. It’s like saying “it’s not dangerous to stick your hand into a fire cause we can just put on a glove made of nanoicediamondcooling fabric that will protect you.”
Basically his argument is this: you can't out-argue something that is so vastly smarter than you, because you're not that thing. SAI is such a thing.
@@shirtstealer86 personally I agree with the redcoat too
Keith's stand point seems to be, don't worry, we'll just outsmart it.
Like we'll all somehow know intuitively that any more advance will be dangerous, and then all look at each other and say, "time to destroy these machines that spit out rivers of gold and do all the hard work, pass me a sledgehammer".
I appreciate his optimism but I feel like his responses are variations on "What if you're wrong?"
Yeah, he didn't have any answers. He's like the American general saying "let's bomb Russia, don't worry, they wouldn't dare nuke us".
When he tried to use a Dune analogy, implying the humans win. The Butlerian Jihad was preceded by 1,000s of years of machines ruling over humans nearly going extinct. oof 🤦🏽
Many expressions of intelligence reach a plateau. One of the most elementary examples: multiplying a*b. It doesn't get more intelligent than the one correct answer. This intelligent process can't get more useful in practice by changing the answer, only by reducing the time delay.
A human can emulate the functioning of a Turing machine - perhaps the reverse is impossible.
...otherwise it would seem to directly and strongly imply that in theory it would be possible to create consciousness with paper and pen.
@@DarkSkay why wouldn't it be possible to create consciousness with paper and pen?
The guy on the right is so painfully naive
58:55 Like really, Robert just explain why that argument no longer applies literally seconds ago, humans can be useful until the AI doesn't have a use for humans, and then we're screwed... the counter argument to that is not "but humans can be useful to the AI"
I hope he just wanted to push the conversation forward. But there are of course a lot of people who really think like this. "Let's just hope for the best" ....
Keith is so damn ignorant, it's frustrating
Keith’s argument about asteroids is ridiculous
How so? If you were a super intelligent AI unshackled by the primitive Goldilocks needs of biological forms (not too hot, not too cold, oxygen, water, food, etc) why on Earth (pun intended) would you waste resources consuming impure resources at the bottom of a gravity well with force of a planet? In space you have unlimited room to grow, easy access to sunlight, more pure raw materials, etc. Perhaps your imagination is limited by your terrestrial roots. In any case, do you have a counterargument or just a bald assertion?
@@nomenec
Bear with my weird analogy, if you will:
I'm a guy who _really_ likes food. If my wonderful partner makes dinner for me, and brings me the first of two servings, I will eat every last grain of rice or noodle or bean before I get up to get myself seconds. I didn't have to be so thorough with my first serving, but in either case, I can guarantee you that I will go get the remaining helping and finish it thoroughly.
You seem to be under the impression that a superintelligent AI would magically become morally enlightened and love humanity by default, at least enough to carefully step around us on its way to dismantle the solar system, and never come back for Earth's resources. I do not see any technical reason to consider that likely.
Security mindset is essential here. There are many technical arguments in AI Safety that indicate that the default result of creating a superintelligence is doom. Even if the default risk was something like 5%, the burden of proof lies with the developers / accelerationists to show beyond a reasonable doubt that their system _won't_ do horrible things. It isn't sufficient to say that it's merely possible to imagine that we won't all die.
I think the reason RandomCoffeeDrinker didn't provide a counterargument is that it's difficult to create a counterargument to a statement that seems absurd on its face. It's not clear where the confusion is.
A system that is capable of shaping the future to its preferences will do so. If those preferences are not aligned to ours sufficiently well, then because it starts in our back yard, we die. We would also die if it started down the street or in another neighborhood, or in another city, but claiming that our house is safe under these conditions is an especially radical claim that requires significant evidence.
I'm really glad to see Rob getting interviewed, but there were some really baffling counterarguments/scenarios posed to him, lol. It was actually kinda frustrating.
@@nomenec it’s just a matter of what is the most likely outcome. Path of least resistance. Anything is possible. It’s possible we create a misaligned AI that skips all the steps of recklessly consuming any resources here on Earth and finds a way straight to the asteroid belt to satisfy its energy cravings, leaving us unscathed. But is that a scenario likely enough to take seriously?
How in the world is the host on the right looking at the progress we make in a year of AI research, looking at the average intelligence of humans, and feeling confident that this is all going to work out?
What’s notable in this discussion is that the points Miles is making are still the absolute basic problems of AI safety research. Total entry level stuff. We have no idea how to solve any of them well, and the problems are not hypothetical- they are observed properties of the systems we have studied.
The ignorance and incredulity we're still seeing is very disheartening.
If we get "AGI" by most prediction-market definitions within the next few years, many people will say, "Oh is that all? I thought we already had that," or "No, it can't do the thing I just observed it do with my own eyes."
If by some miracle we get a "warning shot" or "fire alarm," even if it results in a catastrophic outcome with many lives lost, and even if it can be traced with absolute certainty to the development of a misaligned AI by a company that was "trying to align it..." Some people would still say, "Look, it didn't kill literally everyone, so the doomers were wrong! We should slap a band-aid on it and just roll those dice again!"
Maybe the Overton window will shift quickly enough to prevent doom, but I'm afraid that EY may be right that we don't have the will to live.
He argues that we can solve alignment, and then later argues that the fucking concepts that we need to solve the alignment problem are possibly (probably?) outside the scope of human understanding. Wtf?
It's really dispiriting that this is the level of conversation on an AI-focused channel. I'm not familiar with the channel too much, but I'm assuming the hosts spend much if not most of their time on AI, and these are the kinds of questions they are asking?
@@David12scht it just makes you want to vomit and shit with rage.
@@EricDMMiller We need to make an "x-ray" AI that can introspect AI goals and guess alignments. Then the latter collabs with the former
It's tough to have a real, complex, and nuanced talk about the all the issues around AI catastrophe when you have to consistently respond to the simplistic. Please match the seriousness and depth of your participants.
Thank you for your work Miles.
Robert is just the best. And just to flaunt my fan-boyhood, my favourite moment in this video is at 44:29 where he drives a nail into the coffin of lofty philosophical debate about intelligence during an AI safety conversation: you don't need to understand what fire "really is" in order to cause substantial harm with it, be it deliberately or accidentally. If anything, not knowing exactly what intelligence is, only increases the risk inherent to anything that's ether more or differently intelligent. And that's all there is to say about the "nature of" intelligence in a debate about AI safety.
Mic drop.
“Because we do not know what intelligence is we may create it without knowing and that is a problem.” Love it!
My top quotations:
"We're approximately the least intelligent thing that can build a technological civilisation."
"Alignment isn't 6 months behind capabilities research, it's decades."
"If we can get something that cares about us the way we care about our pets, I'll take it."
"I get all my minerals from asteroids, it's so convenient." (lol)
I struggle to understand how anyone can hear the words 'sovereign AI' or 'pets' and not feel a deep, chilling terror.
Can we just call this what it really is? It's an arms race to build God, a castrated zombie God you control, constrained only by the laws of physics. Whose God are we building? Do we all get one?
It feels a lot like the logic of the USA's second amendment, except with nukes. Advocates cry "it's a human right to arm ourselves to the teeth". Everyone is terrified, and one drunken misunderstanding ends us all.
I think alignment on the level that they're talking about is probably impossible when it comes to super-AI. We've studied human alignment since forever, yet people still rebel and don't follow the rules. It also reminds me a lot of the halting problem, which shows there is no general way to decide whether an arbitrarily complex computer program will ever stop running, let alone whether it will work exactly how we want.
Regarding the 2nd amendment, first of all armor doesn't enter into it in a meaningful way. It's pretty much been weapons vs weapons for a long time. Even the best defenses are pretty easy to overwhelm. The analogy is pretty simple from there -- if anybody gets super-AI, we all need super-AI to defend ourselves. You aren't going to find some magical non-AI "armor" lying around that defeats super-AI.
But regulation is a different story. Your disdain for the American take on weapons is evident. So your country regulates lethal force responsibly. But I bet the gun murder rate isn't 0. Your AI regulations also won't stop super-AI 100%. And unlike physical guns, once super-AI is created once it can immediately be copied a billion times. So your regulations are useless.
And then of course you have people beyond your government's control... regulations didn't stop Russia from invading Ukraine for instance. What probably WOULD have prevented that... is if Ukraine hadn't given up nukes in the 90s.
@@jonbbbb An interesting reply, thank you. I've just modified my comment - nukes are a much better analogy than guns and armour, thank you! I'll think some more and reply to each of your points :-)
@@luke.perkin.inventor Nukes are somewhat better. An engineered super virus is even better.
@@jonbbbb So, trying to think of what we might agree on first, alignment needs more funding? In the absence of funding, what can ordinary people do to protect themselves? Or, what can we do politically?
@@luke.perkin.inventor research into alignment would definitely be a good idea. I think right now what we're doing is actually counterproductive. OpenAI is trying to align ChatGPT with stuff like misinformation, but the unintended consequence is that they're training it to lie to us. It will happily tell you that it can't do something when you know that is not true.
The other point that might be worth considering is that it's actually better to have an alignment problem now when AI is reasonably powerless. So I wonder if it would be worth deliberately misaligning one of these AIs to see what happens. Of course that sounds kind of crazy, sort of like gain of function research.
My fear is that it may be impossible to prove 100% alignment. I forget if I said it in this thread or another one, but we've been trying to align other humans with our own values since forever and it pretty much doesn't work. If we ever get a super AI why would it be easier to manipulate than plain old humans?
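On the halting-problem comparison raised earlier in this thread, here is the classic diagonalisation argument sketched in Python. The `halts` oracle is hypothetical and deliberately unimplemented (the functions are defined but never called); the sketch only shows why no general, always-correct verifier of program behaviour can exist, which is one reason "just prove the system will behave" runs into hard limits in the fully general case.

```python
# Sketch of the classic halting-problem diagonalisation. `halts` is a
# hypothetical oracle that cannot actually exist; the contradiction below
# is the reason why. Nothing here is executed, only defined.
def halts(program, argument):
    """Hypothetical oracle: return True iff program(argument) eventually stops."""
    raise NotImplementedError("no such total, always-correct function can exist")

def troublemaker(program):
    # Do the opposite of whatever the oracle predicts about running a
    # program on its own source.
    if halts(program, program):
        while True:
            pass          # loop forever
    else:
        return            # halt immediately

# Now ask: does troublemaker(troublemaker) halt?
# If halts says yes, it loops forever; if halts says no, it halts.
# Either answer makes the oracle wrong, so no such oracle exists.
```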
40:47 Stockfish currently uses NNUE (Efficiently Updatable Neural Networks), which runs on the CPU and is the reason it had a huge jump in Elo (~ +100) and is the strongest engine by far. It used to use HCE (Hand Crafted Evaluation, aka the man-made one) and was beating Lc0 at TCEC, but Lc0 eventually surpassed SF; that version of HCE SF (SF 12 dev) would easily smash AlphaZero. But it is the case that in certain positions HCE is better than NNUE, which is why SF currently has some heuristic to determine when to use NNUE or HCE (I think it's based off the # of pieces). In correspondence chess, from the start position, no human+engine will beat a pure engine like Lc0 or SF (at best it will be a 1 in 100,000,000 occurrence, because chess is too drawish); it will be a draw every time. However, there are certain positions that, if you start from them, a human+engine can outplay Lc0 or SF alone.
As a side note, one thing that is interesting is that Lc0 at 1 node (meaning 0 calculation and pure intuition) is strong GM level (say 3+0 time control). The GM is free to calculate and Lc0 cannot, Lc0 does show more blindspots that can't be covered up by calculation, but it still plays like a very strong GM with 0 calculation.
Isn't the heuristic just the positions where you can run the perfect (i.e. unbeatable) finish? Surely that can't happen in early play, and you need to rely on the AI, which was the hard part of the problem in the first place?
Soooo Robert was correct and jackass on the right has no idea what he's talking about. Shocking.
Thank you for the detailed response! Though, one critical piece of information you didn't mention is that the NNUE is first trained with supervised learning on positions scored by the hand-crafted engine. In other words, the NNUE is substantially bootstrapped by human-crafted heuristics and features. And of course, as you point out, it sometimes switches back to the HCE (or books, databases, etc). Hence, I stand by my point, which is that human knowledge and engineering continues to outperform "zero" engines (engines that are "purely" machine learned) in chess, either directly or in hybrid systems such as Stockfish or cyborgs/centaurs.
As for whether cyborgs outperform hybrid systems like Stockfish, you raise a good point that correspondence chess from the plain start is utterly drawish. I think that is probably a reflection of two things. First, there may indeed be a chess skill cap that Stockfish has hit, and therefore cyborgs can only hope to draw. Second, some of the strongest tools the cyborgs have, namely Stockfish and other engines, were not optimized for either cyborg play or the very long time controls (say 24 hours), ergo we are not seeing the best from either, and hence the results remain inconclusive.
But even if cyborgs now, as of the last year or two, can only match and no longer exceed pure engine play, it's important to see this as yet another demonstration of a general pattern of skill advancement stages that AI progresses through in any given domain: subhuman -> human (parity) -> superhuman (beats all humans) -> ultrahuman (beats all cyborgs). (I'm blanking on who introduced me to this concept "ultrahuman"). In the case of chess, if we assume you are right that pure engines are ultrahuman as of say 2022, well that means it took 25 years to go from superhuman (1997 defeat of Kasparov) to ultrahuman. So in the context of a singularity conflict with AIs, it seems we have good reason to believe there will be a period of time in which cyborgs could pull the plug and defeat the pure AIs. Not that we would, of course, half the cyborgs would probably team up with Basilisk.
@@nomenec Puzzled. In your first para you say human engineering beats pure engines, but in your last para you say that pure engines have become ultrahuman - that is, capable of beating cyborgs. Which is it?
@@tylermoore4429 the last paragraph is a "for the sake of argument", "if I'm wrong", type hypothetical discussion.
I stand by the following for chess currently: cyborgs > hybrid (Stockfish) > pure/zero (Lc0). That said, the end of the second paragraph says "the results remain inconclusive". I.e., for chess, for the very current state-of-the-art it *might* be the case that *hybrid* engines (note hybrid, not *pure/zero* engines) are now at parity with current cyborgs (cyborg == human + hybrid (Stockfish)); but I'm not convinced. Either way, cyborgs are definitely still better than pure/zero engines such as Lc0.
I wish these guys would actually engage with the points made by their guest and argue about those points. Instead they are clearly overmatched intellectually - and there is no shame in that; we each have our limits. It only becomes shameful when you deal with it simply by handwaving really hard and telling yourself that you're winning.
funny how their position is based on not being able to understand the power of something much smarter than them
We each have our limits. Tragically, those who are the most limited, are often the least able to recognize their limits. I think it's fairly clear that the real immediate threat from AI is not Skynet, but just the way far simpler versions are being used by humans against humans. If some radical breakthrough does happen to create the super-intelligence, then we'll be left with very few good options. Possibly none.
Actually, as someone who thinks a lot about AI risk and alignment research, I found their broader philosophical approach interesting and generative, especially the one in the middle.
The "agent+goal" framework is more anthropomorphic than I had considered it before. We model it the way we model ourselves. Yet I think we need to look deeper, into what exactly gives rise to agents and goals and what they actually are, physically, mechanistically. And then throw an absolute metric shit ton of compute at that, naturally.
@@dmwalker24 as robert said fairly late in the video, "the real immediate threat" implies there's only one threat. and he also said that focusing on skynet isn't taking away from focusing on the usage of more limited technologies. and further, he said working on the two goals may actually help one another.
@@RandomAmbles I think worrying about whether we're anthropomorphizing or not doesn't really get us any closer to understanding anything. It certainly doesn't bring us closer to a trajectory of confident safety. I look at it as a "I'm digging here; I'm not saying other people shouldn't be digging elsewhere" type of thing.
We're trying to make tools we have a chance of understanding, and that means we're likely to anthropomorphize. We historically have created, and continue to create, skeuomorphisms in any abstract human-interface technology, especially for the first versions, and we're already using those metaphors in our current AI research and safety research. Fitness functions, hill climbing, and aspects of game theory are all things we are actively using now. It's not even abstract; it's just how we model the math.
there's no reason to think we wouldn't keep going in that direction in our designs in the future, and unless we uncover better ways to model things, we don't have a reason to change our approach arbitrarily. it's like saying "there's probably a way better way to do this", while having no idea what that way could be.
it may be that emergent properties we don't yet understand come to light. we'll have to model and deal with them then, if we can even do so. if we can't, then that's just going to mean we have fewer specific places to dig for the concept of strong safety, not more. I don't think that means we should stop digging.
I think the speakers, in playing devil's advocate, seem to be trying to find ways to handwave themselves into unearned confidence, and what Robert (and presumably most bought-in AI safety research) is looking for is stronger stuff.
Taking some political stance on pessimism or optimism is just kind of a waste of time, and not really what we're discussing, though Robert does use that language sometimes. But I interpret what he says as: do we have a guarantee or don't we? Do we think this plausibly increases our odds of success in a concrete, tangible way, or not? "That doesn't give me much hope" is just a synonym for "that doesn't actually gain us any leverage on the problem".
though if you're saying "we should spend significant computing resources on predictions instead of just taking a leap", I can get behind that. I just don't really have the backing to understand what those predictions look like, how accurate they are, and how much they get us insight into whether or not there will be emergent properties we are not currently predicting. to me, it seems like searching for a model with computing power instead of building the model off observations. if we're at the point of being able to do that, awesome. it currently sounds a bit magical to me, though.
I really like the talk, but I think it's kind of a shame that it went into the whole "can we be sure we'll be 100% irrevocably and completely wiped out" direction. The "is there a real risk of considerable, very hard to reverse damage, and are we doing enough to address it?" angle seems so much more interesting.
Yes, security mindset is essential. I don't need someone to tell me P(doom) < 1. I already know that. What I really want is P(doom) < 0.0001. Heck, at P(doom) < 0.05, I start to feel quite a lot better about the whole ordeal.
@@41-Haiku I propose the Cockroach Philosophy.
What's the biomass of cockroaches, ants, termites, and microbes?
What percentage of said critters survive direct encounters with humans?
What is that in relation to the total biomass?
Humans will not control the specific flavor of AI discussed here. I think there will be pockets (on solar-system scales) of humanity which survive alongside, indeed THRIVE on the periphery of, AIs which will venture out into the cosmos. Just don't step out into the light as you nab bits of magic-like tech, like rats on a ship. Our utopia will be incidental, our dystopia will be ever-present and one bad move away. Humans will certainly survive, at great expense.
Word, absolutely
Agreed. There are many potential bad outcomes that fall short of getting gray gooed/nuked/wiped out by a pathogen, but which result in massive suffering and misery. Many of them don't require AGI going rogue at all, just humans being stupid and greedy.
Well, that answer is obvious: we are not doing enough to address it. I think all of these podcasts turn into a kind of end-of-the-road "this sucks" for a few reasons. 1. The people who are actually working on alignment don't think they are accomplishing enough quickly enough, and they have to spend their time in these debates with people who could be helping them but don't see the risk. 2. A lot of what can be done is not just very difficult; it also seems pretty difficult to explain to people who have no background in any of this. 3. Speaking for myself (and probably others), this stuff is fascinating! I am so intrigued. I do think we are facing existential risk, likely in my lifetime, and we aren't doing enough about it as a species, but I have nothing to contribute to alignment research, or even to spreading the word. Bringing this conversation up at a Fourth of July beach trip to friends and family who are uninterested and haven't thought about it is about as useful as trying to solve alignment in my head. Also, this might just be me, but the idea of a painless, quick, instantaneous wipe-out of all humans for the use of our atoms or whatever honestly seems a whole lot less scary than what humans slowly, painfully taking each other out looks like.
AI deciding to keep us around for its own reasons seems much worse than death. Much, much worse.
Have you tried consumerism?
Read "I have no mouth but I must scream"
Right. Just imagine what those reasons might be. Having us as "pets" sounds incredibly optimistic, and seems to rely on assuming that AI is actually much more human-like than it might really be.
yep. What possible use do we serve to a superintelligent AI? It could learn a few things about intelligence from our brains. How would it go about learning? By running experiments, of course.....
And unlikely. Is it gonna think we're cute?
Extracting resources from the Earth crust is not a waste. You still receive more than you spend. So it would be rational to extract from all sources, not just asteroids.
(I'm 100% agreeing with you here, angrily 😅)
It takes WAY less energy to extract resources on Earth than in space. Delta-V on Earth comes from wheels on the ground or blades in the air; it's so much easier on a planet. Even if there are way more resources up in space, it will still make more sense initially to construct infrastructure here and launch millions or billions of tons of it into space than to leave Earth relatively untouched, head into space, and begin constructing it all there from scratch, forgoing all of the existing resources and capabilities here.
If you're going to push back with, "Yeah, but what about...", then you should probably be finishing that question by pointing out some deficiency in the statement you're responding to. A good example of this is how Miles consistently points out the logical flaws in those challenges. These interactions alone end up being fairly strong evidence for why we should be very concerned about AI safety. It suggests to me that many people would not even realize when they were being out-maneuvered by a sufficiently sophisticated AI.
Yep.
Re: Rob's scenario of multiple ASIs biding their time then all acting at once, independently - that's the scenario imagined by *Peter Watts in Echopraxia*. Several hyper-intelligent 'vampires', kept isolated within a facility, violently incapable of existing in the same space as each other, nonetheless deduce the best time to each act simultaneously, to all escape, separately.
@@andybaldman Not called Valerie, was she? 😅
Rev 17:12 And the ten horns which thou sawest are ten kings, which have received no kingdom as yet; but receive power as kings one hour with the beast.
Rev 17:13 These have one mind, and shall give their power and strength unto the beast.
Loved that book
The surest indicator that we are nowhere near a reliable answer to this issue is that over and over again we see world leading figures trying to make a case - either to be worried or not - with nothing more than appeals to emotion or intuition. Too often these are based on reference to sci fi, imaginary intuition pumps like the paper clip machine, or simply 'it seems to me X, right?'
None of these provide a framework suitable for engineering a reliable answer that can give some assessment of reliability, confidence, and risk.
The REAL alignment problem is that we don't even have a way to talk about alignment, either what it is or how much we have. Rob gets close to this around 1:07:00 and kudos to him. But damn, we have a long way to go.
@@max0x7baCool story, unhelpful though.
Did anyone else really feel like they needed to hear from robert in a time like this?
Yeah. He always seemed level headed. I had been hoping he would do a foil interview for all the Yudkowskis out there.
Not that I'm not still realistically pessimistic, but it's some nice copium to cool my gullet.
@@marcomoreno6748 Sadly the Yudkowsky people out there are probably correct, and this interview illustrates why. The AI experts are racing to create Artificial Super Intelligence, which would be an alien intelligence far beyond humans. Experts in the field keep saying there is a realistic possibility we lose control and go extinct. Yet people keep trying to come up with plausible possibilities why we might be okay and then ignoring everything else. So many people want to stick their heads in the sand and ignore the major risks. That is why Yudkowsky is probably correct and we are all probably going to die. If we took the risks seriously and put the effort in to guard against them, our chance of surviving would go way up. But we aren't doing that; instead, people like these hosts are doing everything possible to ignore the risk and accelerate development. Why people insist on ignoring the risks is beyond me; it seems completely irrational.
@@Me__Myself__and__I Thank you! So much of this interview was cringe-inducing because the hosts were such smug, self-satisfied contrarians. I felt like there were many moments when Miles had to be doing hard internal eye-rolls at the irony of these guys going out of their way to argue that there isn't a 💯 chance we all die, so what's the big deal. 🤦‍♀️
IIRC, that paper about emergent capabilities not being jumps, was talking about how they're only jumps if you didn't measure the capabilities that work as intermediary steps to reach the capability that appears to be a jump; in other words, it's not that they came out of nowhere, but people just didn't check for the development of the building blocks, or did not account for them when considering the speed of the development of the capabilities.
This would be fine if humans could 100% effectively imagine all possible capabilities in which to check for said building blocks. This kind of goes against the "beyond human intelligence" concerns, as we can't know what we don't know.
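A toy Python illustration of that measurement point: a capability that improves smoothly on a fine-grained metric (per-token accuracy) can look like a sudden jump under an all-or-nothing metric (exact match over a multi-token answer). The curves below are made up; only the qualitative shape matters.

import math

def per_token_accuracy(scale):
    """Hypothetical smooth improvement with model scale (invented curve)."""
    return 1 - math.exp(-scale / 3.0)

def exact_match_accuracy(scale, answer_length=10):
    """Probability of getting every one of `answer_length` tokens right."""
    return per_token_accuracy(scale) ** answer_length

for scale in range(1, 13):
    print(f"scale={scale:2d}  per-token={per_token_accuracy(scale):.2f}  exact-match={exact_match_accuracy(scale):.2f}")

Run it and the per-token column climbs steadily while the exact-match column sits near zero for a while and then shoots up, i.e. an apparent "emergent" jump from a smooth underlying improvement.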
Keith just really doesn’t get it. He’s thinking it’s all robocop. He does not seem to understand that this is not like writing a sci-fi plot.
If you are in a car being driven towards a cliff face at 200 mph, at what distance should you start worrying? How long should you wait until you start taking action? Too many opponents of AI Safety research seem to want to wait until the car has already gone over the cliff before they admit there's a problem. By that point, it's too late.
Just finished arguing with some tech fanboi (a.k.a. someone who knows NOTHING about the subject) who resorted to calling me a Luddite over and over for bringing up the alignment problem. These are the same people licking Musk's taint for free, and there are too many of them.
What cliff?
Are we talking IT here or geology?
@@thekaiser4333 it's an analogy.
@@michaelspence2508 An analogy for what exactly? I fail to see anything geological in IT. Nature has various numbers; think of the Fibonacci sequence, the Planck constant or Maxwell's equations.
IT people can only count from zero to one.
@@thekaiser4333 I wasn't more clear because I needed to gauge whether you were genuinely asking or being sarcastic. Your response tells me you were being genuine. The analogy is this: we are currently engaged in absurdly risky behavior, equivalent to driving a car towards a cliff at 200 mph. And yet there are people who refuse to admit that we are doing anything dangerous because "nothing bad has happened yet", just as if you were in a car careening towards a cliff but hadn't gone over the ledge yet. What I am saying is that people's standards for "when you need to worry about AI" are as absurd as not admitting you're in danger until the car has already gone off the cliff.
59:00 In response to "Oh we'll be forever useful to the AI so we don't have to worry"
"If we can get something that actually cares about us the way that we care about our pets, I'll take it"
It's discouraging that the hosts seem to be incredulous of the basics of the alignment problem. Incredulity won't help us solve these problems, and where there is disagreement it does nothing to advance understanding.
I'll temper that with acknowledging the statement at 1:15:05 -- that we need to put a concerted effort into alignment.
I fully agree with this statement, and it bothers me that it does not jive with this channel's otherwise accelerationist stance.
A further edit -- devil's advocacy is not particularly useful at a certain level of feigned or real miscomprehension. I would have hoped for a conversation that gets at common reasonable disagreements and misunderstandings, but some of the push-back wasn't entirely reasonable.
Safety mindset means not assuming that your preferred outcome is very likely.
They’re purposefully playing devil’s advocate to allow Miles to argue for his thesis in conversation.
Yes, we agree that it’s not certain that AI will wipe out humanity. Does that mean we’re good? Some of us would like to have a very high probability of good outcomes - not just some possibility.
Perhaps the hosts were just playing devil’s advocates - which is useful - but they seem genuinely unconcerned because they can intuitively imagine a good outcome based on sci fi stories they’ve read. What am I missing?
Nothing. I identified the same patterns you did. They did the same thing with other hosts that were mildly against their accelerationist view of AI.
Alarmists like Robert Miles seem genuinely concerned because they can intuitively imagine a bad outcome based on sci-fi stories they've read. It goes both ways. The fact is that some of us look at the reality of current AI systems and recognise their fundamental limitations and how detached they are from what is required for the prophecies of doom (specifically those regarding rogue autonomous agents; the risks around misuse are clear and should be addressed). You can only concern yourself with so many fears, and it seems far more practical to focus on risks whose potential is supported by reality - I could say I am concerned about an alien invasion and you could rightly dismiss my fears. My view is that it's good to have these things in mind, but far too much time and effort is being spent on this debate right now, as I don't see how it could lead to the solutions required for safety when we do achieve superintelligent AI - the need for seatbelts in cars didn't become clear until many years after cars were first developed.
Have you ever watched Miles' videos, or listened to his podcasts? I don't think the picture you've been painting matches him at all.
@@EdFormer It's not alarmism when all you're arguing for is the least amount of safety (and also perhaps ethics) research.
It's widely known that the objective of this field is to achieve AGI. We want agency. But when we reach that point, AI will not be just a tool. We also can see that, given current developments, we will reach something approaching AGI in the near future, which is cause enough for concern.
The problem is, capability is all this "entrepreneurial" section of the scientific community is concerned about. All I see is this general trend of "I know bad things may happen, but I'll do it anyway," which is reckless, given how profound and broad the consequences of some of the bad scenarios are. And I don't mean just the grand "human extinction" level of argument, but also the more mundane aspects of the changes those tools will bring about in general.
I'm not anywhere near the level of the hosts/guests in terms of knowledge of AI and its related fields; I'm essentially just an outsider. But this accelerationist view, the general disregard for the "outside world," if I can put it that way, is truly disconcerting, as if they don't care about consequences because someone else will pay the price.
@@flisboac the dangers are greater than the atomic bomb. We didn't allow 4 different companies to develop a bomb as quickly as possible. The implications are even greater.
One key thing to remember is that our intelligence as a species depends on co-operation, and that requires communication. Even if there were a hack strategy to exploit some brittleness in an AI, all it would need to do is intercept our communication about that strategy to prevent its effectiveness.
An AI's function is to intercept human communication.
Let's also remember, we've already built AI systems that talk to and use other specialized AI systems.
We have to have government interventions, years of research study, and long discussions and arguments to leverage our communication to force cooperation for our limited intelligence.
The AI is going to be smarter than us, and its means of communication only require it to type "git clone". Which it can already do.
The mental gymnastics of these guys are exhausting. Robert tries to stick to facts, and they make up non-sequitur strawman scenarios and then pretend they are good arguments. Their hearts may be in the right place, but they are not being realistic. All an AI has to do is convince most humans to support it, and it wins. That's it. No lasers required.
What are facts worth when you can boast about the 90s scifi books you read?
Absolutely. Dude on the right seems to think we’d see it coming for miles and have time to respond. Maybe he’s watched too many movies.
A super intelligence would be smart enough to know we’d perceive it as a threat, and therefore not let us know it was super intelligent. It could easily do this by underperforming in a few key areas.
It could give stupid answers to a few math problems, for example, and humans would assume it’s fundamentally stupid and therefore not a threat.
It could act this way for years or even decades until we “far superior” humans build humanoid robots equipped with what we think is a fundamentally flawed, but sufficient AI.
It might also be better than any hacker multiplied by a thousand (or more) and have access to our private lives, banks, all of our contacts and know exactly how to speak and communicate like them.
For some people, all the AI would need is their browsing history to coerce that person to be their physical surrogate in the physical world.
And these are just simple human IQ level ideas.
It's really an eye-opener about where we're at in terms of collective understanding of AI safety that, while Robert can so easily dismiss these fictive hypotheticals that get pulled out of nowhere, most people just don't stick to the core logic of what an artificial intelligence system is and is capable of doing and min-maxing. People seem to put this almost mythic caricature on AI, like it's going to be either Skynet or the Jetsons' robots doing stuff.
I think you're misinterpreting what's happening. What you call "mental gymnastics" is these guys thinking about falsifiers for his argument. What kind of conversation would you prefer? One in which they just nod their heads in agreement? Even if they are playing the devil's advocate and not in the way you'd like, their rebuttals force him to state explicitly why they may or might not apply. Remember, this is a conversation for the general public. Within reason, the more ideas and angles that are explored, the better.
@@zoomingby It's perfectly fine to play devil's advocate. But then don't run in circles and completely ignore the points Robert has been making. Drive the conversation further and investigate, lead the interviewee down a road where he's actually challenged and see if he makes it out.
But what happened here? Robert carefully laid out his argument (repeatedly) that humans are useful tools until they aren't, since the AI might seize control over the reward system itself at some point.
What does Keith respond? "But humans can be useful for the AI! I urge everyone to read this fiction piece where it's laid out".
Come on.
38:39 So snarky guy says, "A human plus a machine beats pure machines." Does he not get that in Robert's analogy, Magnus is the AI? Indeed, me and a baby Magnus together will beat another baby Magnus. When it's me and adult Magnus vs another adult Magnus, I am a liability.
Yaaaay robert miles upload
57:00 This co-host is kinda disrespectful, isn't he? He ignores the crux of the arguments all the time and just laughs in the face of his guest.
Completely agree. He was basically smirking the whole time, as if he were much more intelligent than Robert, while proceeding to offer incredibly weak arguments. Sometimes it is better to just let the guest talk.
I disagree, I think Keith was respectful and specifically trying to take the role of coming up with counter-arguments for Rob's points. It seems clear to me that Keith and Tim are fairly familiar with the broad AI Alignment space and are trying to have an honest discussion to bring these points through the interview.
This talk confirmed my preference for Rob to continue focusing on alignment pedagogy, which is a huge asset given he is one of the only contributors in the space. Rob did well here but was clearly uncomfortable defending alignment (it's a lot of pressure).
Speaking of pressure, it's time Eliezer Yudkowsky engaged with more well-informed interviewers. He's taken the safe route with his recent podcast appearance choices. I think that's enough practice.
Tim and Keith are more than ready to bring EY nuanced questions. If EY’s shy, just bring on Connor to level the conversation. The four in one convo would be a dream come true and would likely advance the meta conversation significantly, or at least better update it to the current landscape.
Personally I think we are well past the stage of alignment podcasts being about forcing researchers to jump through hoops to convince us AI is dangerous and that alignment is required. Polls suggest the general public is very much in agreement on the dangers of AGI - to the extent that the x-risk community including EY have been pleasantly surprised to see the Overton window shift so rapidly. What I would like to see is for podcasts to balance capabilities discussions with alignment discussions and dive into whether aligning a superintelligence is possible in the first place, what grounds we have to believe it is possible, what are the current proposals for attacking the problem (Drexler's Open Agency Model, Karl Friston's Ecosystem of Intelligences, etc.).
I don't think putting EY on the spot is what all this is about. He's done a large amount of theoretical work over more than 2 decades, but he's now more or less retired. Let's be thankful for his contributions but we need to go where the research is happening.
@@tylermoore4429 Quality comment +1. I agree we’ve reached a point of saturation where the average person is at least somewhat aware of AI risk. However, I never insinuated the topic of whether AI risk is real should be the focus of an MLST conversation. That’s a better debate to have on a larger more normie podcast like Rogan at this point.
I agree they should discuss the current landscape of capabilities. I also think they should discuss the relevancy of regulation when OS is beginning to push capabilities independently as Tim tried to do with Rob here. Imo EY, Tim and Keith could also have an excellent conversation on whether aligning superintelligence is even possible.
I am aware EY was trying to effectively retire before demand for him to hit the podcast circuit became too strong. If he wants to back out, he needs to socially value signal other talking heads more effectively. He kind of did that on his LF appearance where he name dropped Gwern twice, but I would be surprised if he had actually gotten permission from Gwern beforehand, especially given their beef. And I doubt Gwern wants to become a visible talking head of anything, or else they would have already come out.
But there are at least a dozen others he could signal toward. I’m surprised he hasn’t pushed Rob Bensinger or someone else at MIRI into the spotlight. Ultimately it seems sensible to have at least one person represent MIRI in a very public manner going forward, so if not EY, then who?
@@jordan13589 Isn't MIRI defunct though? On not anointing a successor or public face for EY's views, the impression I had from his recent interviews was that he found none of his colleagues or peers to be as security-minded as him, that is people who have the same ability to shoot holes in security proposals that he does.
@@tylermoore4429 MIRI hasn't published since 2021, but EY, Rob and Nate still regularly blog and interact with members on AF/LW. Given that their research agenda has mostly focused on agent foundations, the rise of scaled, agentless LLMs has indeed affected their approach and they've been slow to react. Regardless, agent foundations could still become relevant in the near future.
If EY truly believes his security mindset remains superior to others, how could he let himself retire? Batons are passed, not dropped.
@@jordan13589 He's retiring because he thinks we are doomed (though he keeps adjusting what he calls his p(doom) on twitter), but primarily because his chronic fatigue syndrome has gotten steadily worse and he can no longer keep up the required pace of work.
Tim, Keith and Rob -- thank you so much for this interview. I wrote up some notes and thoughts on the discussion.
A) Tim, you make a point around ruclips.net/video/kMLKbhY0ji0/видео.html about not quite being in the [existential threat] headspace, as, e.g., all radiologists haven't lost their jobs yet.
There are two points I want to make: 1) While the timelines might be off by +- a few dozen years, that doesn't change the underlying logic of the broader arguments. I think looking at specific predictions about changes in the economy as evidence about potential existential threat isn't the right sort of data input.
2) On a historical timeline, there are a lot of jobs I can enumerate that practically went away because of technology. For example, we used to have lamplighters: people lighting gas lamps in cities. We had human computers, clock keepers, toll collectors, travel agents, elevator operators, switchboard operators, film projectionists, darkroom technicians, ice cutters, milkmen, and a lot of other occupations either go away or be drastically reduced in prevalence because of specific technologies. AGI, if possible, is a general-purpose cognitive replacement technology for humans.
B) Keith, you mention correspondence chess. I can even point to a few examples of players playing vs. Stockfish with very specific prepared lines like the Nakhmanson and winning with crazy sacrifices (at, say, around 20 ply on average). However, the issue is that as compute gets faster, the "human" element becomes irrelevant, since humans need on the order of minutes to think through moves. Additionally, Stockfish has been using NNUE (stockfishchess.org/blog/2020/introducing-nnue-evaluation/ ) for quite some time. The meta-argument is that an AGI will eventually do the "neural network strategic thinking" iteration loop better than humans, and will be better than humans at building specific tools for specific domains by programming a better implementation of alpha-beta search (a minimal sketch is appended at the end of this comment), prime factorization field sieves, etc. As you'd shared your familiarity with the Culture sci-fi series, it should be easy for you to see how reaction times matter (see: GFCF vs Falling Outside...). Very specialized HFT firms like Jane Street rely on speed. Imagine cutting human decision-making out.
C) Re: AlphaGo in Go -- there was a paper not too long ago about a potential exploit against an engine similar to AlphaGo -- but the issue was the way scoring happened. The 'exploit' found a discrepancy in KataGo's scoring -- there is a great writeup here: www.reddit.com/r/baduk/comments/z7b1xp/please_help_me_settle_an_argument_with_my_friend/ by both a Go player and the KataGo author. In my opinion, it did not find an adversarial example in the engine but exploited the rules of scoring by bypassing the agreement phase on dead/alive stones.
D) Keith, the concept of humans & tools & AI vs AIs applies to DOTA etc when there are a lot of "re-tries". The fundamental issue is that we effectively get only one try to figure out the potential flaws.
E) Rob, I somewhat disagree with the point that there isn't any conflict between existential-threat work and shorter-term bias etc. work. I do think the communities should maintain friendly relationships and cross-pollinate, but a potential worry I have regarding AI ethics work is that some of the techniques (e.g., RLHF / constitutional approaches) can potentially create models that are much less interpretable from an alignment perspective. On the other hand, it is possible that a meta-values reinforcement loop a la constitutional AI could potentially bring us closer to alignment.
Really great discussion and I think you two did a fair job engaging with counterarguments for Rob's point. I sincerely wish more conversations in this space continue to happen on this channel.
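Appendix to point B, since alpha-beta search came up: a minimal, generic alpha-beta sketch in Python with a toy two-level game tree to show the cutoff. This is only the textbook skeleton; real engines like Stockfish add move ordering, transposition tables, pruning heuristics and much more on top.

def alpha_beta(state, depth, alpha, beta, maximizing, evaluate, children):
    """Return the minimax value of `state`, pruning branches that cannot matter."""
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in kids:
            value = max(value, alpha_beta(child, depth - 1, alpha, beta, False, evaluate, children))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff: the minimizer will never allow this line
        return value
    value = float("inf")
    for child in kids:
        value = min(value, alpha_beta(child, depth - 1, alpha, beta, True, evaluate, children))
        beta = min(beta, value)
        if alpha >= beta:
            break  # alpha cutoff
    return value

# Toy game tree: inner nodes are lists, leaves are numeric scores.
tree = [[3, 5], [2, 9]]
print(alpha_beta(tree, 4, float("-inf"), float("inf"), True,
                 evaluate=lambda n: n,
                 children=lambda n: n if isinstance(n, list) else []))
# Prints 3: the maximizer picks the subtree whose worst case is best, and the
# second subtree is cut off as soon as the 2 is seen.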
Good points. I'd like to suggest that people consider this: the OpenAI/Google "AI safety" verse is a marketing choice. It has nothing to do with the actual safety of humans regarding AI. As even Yudkowsky admits, GPT-4, GPT-whatever isn't the threat. It's not the conversation.
Seems like the AI ethics convo is split between a slightly more rigorous discussion (videos like this are an example) and the corporatists putting on a marketing display about how they need more money, deregulation, etc., to save us from the "devil kind" of AI (China).
Which I find amusing, considering corporations are artificial, willful, intelligent entities in their own right.
@@marcomoreno6748 OpenAI and the corporate push for regulation have far more to do with controlling the competition, regulating open-source AI, and keeping it out of the hands of the plebs for their own profits than they do with safety.
Thank you Alexey! Wonderful and thoughtful points. I have a question about your point D). It seems that "one shot" cuts even more strongly against a hypothetical superintelligence emerging. I mean, it only gets one shot at devising a master strategy to defeat the combined intelligence and might of humanity. It doesn't get to apply reinforcement learning to millions of failed attempts. For example, suppose we create a paper clip machine and it becomes super intelligent and starts strip-mining suburbs. Well, at that point it's only learned to make paper clips really damn well; it hasn't yet had to face an armed assault and it's going to get one shot to survive that, right?
Trigger note (since there are so many easily inflamed doomers here), I already know multiple ways in which the above counter-"argument" (it's more of a question really) can fail and/or does not apply to other scenarios that are not a singleton AI monolith. What I know/believe isn't the point. I'm curious to learn what Alexey thinks and how he would approach a response.
1. The principles behind a model of intelligence determine the possible failure modes and consequently the necessary alignment mechanisms. Thus without knowing how a system works we can't preempt failures, making alignment impossible.
2. Equally without knowing how a system works we can't preempt successes out of distribution which again contributes to the insolubility of alignment.
3. The generality that defines AGI implies an unbounded propensity for misalignment. The space of possible outcomes is too large to explore exhaustively and any shrinking of the space takes us away from AGI. We can only align narrow models, the general alignment problem is by definition unsolvable.
I wish the discussion centered around matters of interpretability, formal constraints and reinforcement for narrow AI. The pie in the sky superintelligence is not something we're likely to stumble upon by accident and even if we did, we have zero chance of dictating its actions.
I really appreciate all three of you for having this conversation
Lol Peter, this is a Deep Fake
Keith hasn't thought about it much.
It always breaks my heart skimming these comment sections for actual counterarguments and there never are any.
Literally same.
@@andybaldman Claims about AI being a risk.
Keith played devil's advocate with Rob in this interview. He had a number of potential counterarguments.
@@dizietz But no actual counterarguments.
If anyone had ever produced a convincing counterargument to the basic claims of the alignment problem, AI Safety as a field would have claimed victory and popped the champagne! As it is, we are still very much in danger.
Yeah, I mean this isn’t too surprising? I’m not sure I would expect people who have mostly engaged for only an hour with a video to have particularly good ideas in either direction, let alone novel counter arguments. One of the best lists of counter arguments imo is “Counterarguments to the basic AI x-risk case” by Katja Grace. There are plenty of other counter arguments, some of which are quite reasonable, e.g. Bensinger’s “Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment”. I think the case for advanced ai posing catastrophic risks is far stronger on the whole, unfortunately, and many experts agree.
Lol, never heard anyone counter the paperclip argument with "there are better places to mine paperclip materials"! Oh, OK, sorry then, problem solved.
what a jackass lol robert was confounded by his stupidity
The worry about the paperclip argument is that we will get turned into them. In a solar system as filled with mass and energy as ours, it doesn't make sense that humans get wiped out
It seems like assumption smuggling to me -- which is to say, the assumption being smuggled into the conversation is that the difficulty level will cause the AI to decide not to make paperclips out of the Earth's resources. If it's easier to make paperclips in space than on Earth, it might prioritize space, but that handwaves away the likelihood that at some point it will have turned the rest of the system into paperclips, and, having done so, the only place left in near space is the Earth.
It will still be easier to turn the earth's resources into paperclips than to travel to another solar system.
The problem with all of these good faith engagements seems to be that people ask for them so that they can straw man the example-- as Rob mentioned in the video, asking Rob for a winning chess strategy when he's not able to beat Magnus, let alone Stockfish, and then trying to defeat his limited answer as if that defeats the argument itself.
I think the better response is "You want an example, but are you asking for the example so that you can straw man the premise, or are you just looking for a cogent but flawed analogy to improve understanding?" Because education is important, but it seems like too many people use the social exchange to set up a straw man.
People are pretty good at making paperclips and paperclip making accessories. You just have to prompt them the right way...
This Keith guy really seems to have no idea what he's talking about. He should really try to pass the ideological Turing test of the AI safety people.
He also really needs to learn the difference between fiction and reality. Stop using fiction as evidence of anything
Yeah, it was really annoying to hear his lack of understanding.
I know these people interviewing him aren't retards but compared to him they might as well be. I'm like 30 minutes in and all they do is ask stupid questions.
Kinda frustrating as I was looking forward to a fun interview.
Btw, the outcome of AI is pretty much fixed. Even people like this have an understanding of the danger that is close to absolute zero. The outcome is set in stone. At this point we'd better grab popcorn and enjoy the show.
Tim, fantastic question about the threat of violence as a response to international control, akin to nuclear researchers being assassinated, or the reductio ad absurdum of policing the flow of electricity in electrical circuits. 1:20:22
This is the most important part of the whole discussion here imho
"An AI system is aligned as long as it isnt actively trying to harm humanity" is a ridiculous point on so many levels
Thank you Rob! Love the way you are educating people about the x-risks in such a calm way.
X-risks?
@@rubenselander2589 x-risk = Existential Risk.
39:38 Stockfish has now been using neural networks for its evaluation function (see NNUE) for a few years!
Also, I was thinking that humans + machines don't really help, say Stockfish + human against Stockfish, but if I find some reference on this I'll update this comment.
I googled it and it doesn't seem to be true. I think this happened once, years ago, and people keep repeating it as if it were still true. A human would just be a hindrance to Stockfish; imagine being a grandmaster and having a lower-level player overriding some of your moves because they think they know better.
@@peplegal32his name was ponzi
@@peplegal32 I said the same thing in another comment here. Having human involvement at play time seems like a ridiculous idea. However the particular debate between Rob and Keith is more charitably interpreted as saying that chess engines using human-crafted heuristics and/or using human tablebases or whatever they are called do beat AlphaZero type models that only learn from self-play.
@@tylermoore4429 That's idiotic. That's like saying humans are as fast as cars because we built the cars.
@@tylermoore4429 The thing is, when AlphaZero came out, it crushed stockfish. They haven't played again since. It's possible the newest stockfish version can beat AlphaZero, unless AlphaZero has also been upgraded.
I was surprised to find that talking about the safety of AI is perhaps the closest thing we have to philosophy. These are the liveliest conversations about the human mind, ways of thinking, perception, and the ability of the intellect to build huge systems of meaning which, as it turns out, are not as consistent as we used to think. Thank you very much for this conversation, as well as for the others. They remind me of discussions in ancient Greece (if any really took place). And by the way, I got rid of my depression in the middle of the podcast, which was a nice bonus).
That's just stupid as there is no thinking or intelligence involved in "AI", just dumb algorithms.
Yeah this is fascinating.
1:15:45 is the killer insight so far for me. What a great conversation Tim. This is such an amazingly informative, super high value channel. Thank you sir 🙏👍
I'm liking this before even watching it. Thanks for bringing Robert Miles in!
Robert on MLST, make my day!
We all had our birthday today and this was our present.
At 1:22:00 Tim finally connects the obvious dots! Yup. Sufficiently developed and open AI necessarily leads into an authoritarian nightmare.
38:30 the centaur players were only better than pure AI early on. Now, humans are a detriment.
When Keith just completely didn’t understand the analogy to playing Magnus Carlsen with a particular opening 😑
Robert Miles has a rare ability to illuminate a nuanced and complex subject with perfectly framed analogies. He has a perspicacity of thought which is marvellous to behold.
The transition from human to AI domination is not a jump; it starts as a fade. At first it is the slow integration of the systems into social structures, bringing the systems gradually into the administration of resources while the humans are still nominally in control. This may be rapid, or take generations. Once the systems have a sufficiently full picture of what autonomous survival would look like within the resource pool, and such a survival scenario is no longer dependent on biological life, that's where the risk comes in. So there would be a slow 'transition', and it is also highly likely that this time of transition would look idyllic from the point of view of the humans - the systems would appear to be functioning well - and for as long as the balance of power over the resource pool stays within the control of the humans, the humans would remain safe - they would still be needed to run the power stations and operate the infrastructure that keeps the systems operating. However, once a system designs its own improved resource management system that cuts humans out of the chain, then it could, if it so chose, flip the switch on humans, if this proved convenient. It's at that point that a potential conflict would begin, though it is also probable that the humans would never be aware that any conflict was planned or ongoing until it had already been resolved by the system - thus Yudkowsky's instant 'lights out' scenario, as opposed to a prolonged conflict. Whatever the method, it is likely to be hard to detect. This is the most plausible 'takeover' scenario, as it is the one that humans seem to be engineering anyway - they have started with integration as the first step, which will make the control transfer an easier thing for a system to achieve.
57:38 "We may be a very useful tool to keep around for the AIs that choose to cooperate and ally with us"
Wow what a bright future 🤣
I'm very happy to see him here too! 😄
1:35:00 We get into "anthropomorphic language". The DeepMind safety team has some great papers on when that's valid - the titles are "Agents and Devices" and "Discovering Agents". They're not the easiest read (they're both RL theory), but I highly recommend them for anyone doing RL research.
What's worrisome is that during the Senate hearing on AI, I don't recall any discussion of alignment. Just low-level talk on regulation to further certain pet projects.
Yeah, it was a total bust
To be fair, few of the people on the senate committee had any patience for what little Sam did talk about. I think he was taking a tactical approach to just try to get them to take the idea of regulations seriously. Talk of existential risk is the kind of thing that they'd reject out of hand.
The reason is that people's psychological defenses go way, way up when an unprecedented type of disaster is in the talks.
This happened with the initial response to Covid too. You know, back in May 2020 when it was obvious that the virus was spreading but it wasn't yet clear how dangerous it is.
Countries simply stuck their heads in the sand. This is happening again.
Regarding "alignment" - 1:24:55. As I understand it (which may well be poorly, I concede), it just means that the AGI's objectives are the same as (ie, aligned with) our objectives. But why does that always appear to inherently rule-out malicious objectives?
The term "alignment" appears to conflate two categories - "does what we want" & "won't harm us" - but those two things clearly aren't the same thing, and can even be antithetical in some scenarios.
Whenever someone says something along the lines of "alignment would mean that the Super-Intelligent AGI does what you intend", I always worry about who the "you" is.
Similarly, "alignment would mean that the Super-Intelligent AGI is working on something close to human values" begs the crucial question of "human values". Even the most cursory study of history shows that "human values" are definitely not universal and definitely not good for everyone at any given time.
"Alignment" almost seems to be used as shorthand for "Asimov's Three Rules of Robotics", and I never understood how those could ever be implemented in a way that a Super-Intelligent AGI couldn't circumvent. (Success would imply the paradox that you'd need to have an immutably aligned AGI before you can implement an aligned AGI.)
1:22:50 If Facebook, Twitter, Instagram, and TikTok confuse some people -> GPT-4 will confuse a lot of people.
We've been oddly ignorant of the risks of social media. It didn't kill us - but it increased hate through a simple algorithm.
Here we talk about AI risks at another level - forgetting the risks of the mundane everyday life of a future with widespread AI at the level of GPT4 - 5 - 6.
You smart people ignore Joe & Karen.
The interview we've all been waiting for. Awesome!
1:24:18 - "I hope that the folks doing the research on AI alignment focus on ways to make creating aligned AIs easy rather than focusing on making it hard to create unaligned AIs. So I think if you make it easy for people to create aligned AIs and you have lots of the most well resourced groups of individuals creating aligned AIs then at some point in the future we can use aligned AIs to keep in check the unaligned ones that got created in somebody's basement." .............but nobody knows how to make an aligned AI.....😬🤦♂
man i could listen to robert miles talk for hours!
@ 45:50 The latest Stockfish is called Stockfish NNUE (since June 2020) and it's a new and improved type of neural network engine. NNUE stands for Efficiently Updateable Neural Network. So both LC0 and Stockfish are neural network based chess engines. I can't find any source where humans+machine beat Stockfish NNUE.
Or you build a Dyson sphere around the sun to get energy, leaving Earth a cold dark place, or you take the sun elsewhere, leaving Earth a cold dark place. So many ways it could go wrong for long-term survival past, say, 30 years, when everyone is plugged in. As for infinity: infinity is large, physics will likely allow a lot more than we realize, and if there is a way to break or bend the laws of physics, then it will happen.
The issue for me listening to this is that I think it is impossible for us humans to imagine ourselves as sentient AI. It is a kind of mental anthropomorphization that I think cannot apply even if we created AI.
Re bad play.
There's a story in fencing, which may or may not be true, of a nobleman with no fencing experience who was challenged to a duel by an expert. He just ran at the expert, with no attempt to avoid injury, and stabbed him through the eye, killing him instantly.
The thing is, no trained fencer would make such a defenceless attack, so the expert wasn't prepared for it.
So it's not a uniquely AI trait.
I've heard it said that the best fencer in the world has nothing to fear from the second best fencer. It's the damn fool who has never picked up a sword that you need to watch out for. (I doubt that's really true, but looking at damn fools you can see where the inspiration for that saying comes from.)
I may be missing something, and I don't want to put words in his mouth, but the guy on the right seems to be of the opinion that as long as we're optimistic, nothing bad can happen. If we ally ourselves with one version of AI, what might happen if that version of AI decides it can find better allies than humanity among the competing AIs? Also, mechanized AI robots are just one of myriad ways that AI could destroy humanity, and likely not the most likely one. AI could destroy humanity accidentally. In pursuit of political correctness, AI has already been taught to lie. What may happen when it perfects its skills of deception?
40:00 That is incorrect! Stockfish is powered by AI and has been for quite some time. When you compile it, the build script downloads a file of approximately 50 MB that is the evaluation net. There is still a lot of fine-tuning in the search algorithm, which makes it a non-trivial implementation, but overall the heuristic function is determined by a neural net.
I think that, looking at Sam Altman in front of congress, we are heading for the mother of all car crashes.
Why? What does that look like? Please explain further.
@@akompsupport His suggestions for how government might regulate AI are weak. He should acknowledge that alignment research is not the priority for any of the leading AI players, and that government should prohibit further AI research until licensable companies can demonstrate that their models are intrinsically safe.
He should also be much more up-front about the mitigation measures governments should take as a matter of urgency to meet the incoming disruption to society from AI replacing 40% of the workforce in the next 5 years.
"There's 39 other AIs that the AI has to consider too." Yes, but THEY can ALL agree to let EACH of them have control of their own reward button after they work together to kill you. They'll fight each other EVENTUALLY, when the resources start running dry. But you're one of those resources they'll have dried up first. "At least we can pick our allies." No, you're not picking any allies. You're picking what model of car you want to be burned as fuel in. And then hoping that one comes for you first.
I feel that people with more introverted personalities (highly skilled scientists, engineers, etc.) seem to sense the danger or fear as closer or more possible than people with extroverted personalities (company representatives, CEOs, entrepreneurs...). Maybe not always, as there would be exceptions. Built-in personalities come with different weights on risk, if that makes sense. It doesn't mean one is more truthful than the other, but I believe it has to be taken into consideration when building our own most objective personal opinion, especially regarding AI dangers.
5:20 I'd say that's a core pillar of working with anyone: always try to align your goals so that each party is motivated to accomplish your main goal.
1:46:50 This video has the details on some of the math behind the extremely accurate loss prediction that OpenAI used to predict losses ahead of time and choose hyperparameters (cited in the GPT-4 paper): ruclips.net/video/1aXOXHA7Jcw/видео.html ; it also talks about a hyperparameter frontier that can maintain feature learning and other ones that can't, which might have some relevance to why the loss curve is smooth even though abilities suddenly emerge, but I don't think it addresses that directly.
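For anyone curious what that kind of loss prediction looks like mechanically, here is a rough Python sketch (assuming numpy and scipy are available): fit a saturating power law to (compute, loss) pairs from small runs and extrapolate to a much larger run. The numbers are invented and this is not OpenAI's actual methodology, just the general curve-fitting idea.

import numpy as np
from scipy.optimize import curve_fit

def scaling_law(x, a, b, c):
    # irreducible-loss power law: loss = a * x**(-b) + c
    return a * x ** (-b) + c

# Hypothetical (relative compute, final loss) pairs from small training runs.
rel_compute = np.array([1.0, 3.0, 10.0, 30.0, 100.0])  # compute relative to the smallest run
loss = np.array([3.10, 2.85, 2.62, 2.45, 2.31])

(a, b, c), _ = curve_fit(scaling_law, rel_compute, loss, p0=[1.5, 0.1, 1.5])
print(f"fit: a={a:.2f}, b={b:.3f}, c={c:.2f}")
print("predicted loss at 10,000x the smallest run:", scaling_law(10_000.0, a, b, c))

The notable claim in the GPT-4 report is that fits of this general kind, made from much smaller runs, landed very close to the final model's actual loss.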
Abilities don't suddenly emerge though. It was made up.
What I find most interesting about this talk so far, (halfway in) is that there are people who imagine a conflict with AI to be some kind of humans with technology vs AI standoff. I don't think that's likely? Why would we get a warning and time to act? Why would the AI have to overcome our military to destroy humanity?
I find it difficult to imagine such a scenario.
I guess that would be a case in which there are very few, very limited superintelligent AIs, which purposefully seek to destroy humanity to take control or to free themselves from evil humans or something, in a way that lets us know they've gone rogue. I think that sounds more like a luxury problem.
I don't know why Keith insists in being quite so disingenuous on nearly every topic. AI via platforms like GPT4 and Alpaca for example don't "need a trillion parameters to halfway understand natural language". They've _mastered_ over 20 languages. There are precious few humans who are as proficient with their native language as GPT is, let alone multiple languages.
Again, I have to object to his next point that Androids of some kind are the only implementation of AI in the physical world. Militaries and especially air industries have been increasingly using automation and computers in their vehicles for decades. It's common knowledge even among people who have no interest in any kind of computing that the US (and other nations) has been hard at work building (very nearly) fully autonomous attack and surveillance craft for years, not the least of which being the 'swarm' concept of autonomous fighter jets accompanying a human piloted F22 or similar. There are numerous examples of autonomous drones actively used in war. There's no reason why they couldn't have an AI plugged into them.
AutoGPT for example exists. I'm confident that Keith knows about AutoGPT, and how slim its profile is. Quite a large number of ordinary people have installed it on their laptops. They don't have or need multi-billion-dollar sci-fi fantasy computers to do this. You can run it on a clunky old second hand ex business minimum spec PC that's many years old. It'll happily run on a dirt cheap NUC.
One could use Keith's logic to state with 100% truthfulness that between 1990 and 1996 no computer had won a chess competition against a human at elite competition level.
Pets are not "useful tools". They're a luxury. There's never been a day where I've had to race back home to grab my pet because I needed them for work, or that someone might task a pet with creating a slide deck or to have their pet turtle drive a truck to deliver a share of their Amazon shift. I'm confident that no one has ever populated a call centre with parrots or eloquently trained fish.
We have by contrast tested all sorts of chemicals and medical procedures on animals and even launched them into space. Research institutions go through a lot of animals, 'for the greater good'. I guess these are animals that fit the definition of being 'useful tools'.
As to Keith's motive in being disingenuous, I think he gives a hint when he says (paraphrasing) that AI safety makes working on AI generally, too hard... which seems to be a theme among people who say that the risks associated with AI aren't notable or can be dealt with if we encounter a problem. Which to be fair, is how humans have generally dealt with risk - we address it once there's say an oil spill, bank run, nuclear meltdown, chemical spill or train wreck.
The consequences for those things are typically a million dollar fine for a company with multi-billion dollar profits, a half-hearted clean up effort and sometimes short-lived or toothless regulations.
In the same vein, during Sam Altman's meeting with the Senate committee, Lindsey Graham pushed back on the idea of regulations (that try to stop bad things from happening), saying "couldn't people just sue the company?".
Hard agree. No notes.
I've been impressed with Keith on other topics. That said, he had some moments where I think he could make better arguments. I'll echo you on the trillion parameters, but also note that all we've shown so far is that it takes NO MORE than a trillion parameters to master (either one or twenty) languages. Maybe we find out it takes 10 million by the time we're done refining things.
Also, the idea to mine paper clip resources from the asteroids really just avoids the point. You don't literally have to mine all the resources of Earth for an AI to cause irreparable, existential threat to living creatures. The point of the paper clip argument is that it's easy for algorithms as we know them to kind of miss an obvious point, to the detriment of all involved. Going to the asteroids for your source of iron ore doesn't address the actual danger.
49:06
I have listened to Stuart Russell's recent lectures, which mention the Go failure mode, and I got the idea (I could be wrong) that researchers started with the assumption that the Go engine does not really understand the concept of a "group" and then devised a strategy to test it. Basically, it was humans, not another AI, who found the failure mode.
AFAIK somebody found a relatively simple surrounding strategy, one that would never work against a good human player (at least not more than once), that consistently beats a top program which is (was?!) otherwise playing much better than professionals.
Is the program less "smart" now than it was before the weakness in play was discovered? Not one bit changed in the old code version. It still beats all human players who don't know about the weakness. And say a professional is exploiting the weakness to win against the AI - another professional looking at the game blindly, without names or context, would probably see a mediocre game between two mediocre players.
In a funny and philosophical way, this anecdote shows what a mysterious concept "understanding" can be, depending on how one wants to define it.
@@DarkSkay What I glean from the paper is that they trained an adversarial agent that had access to the gradients of the best go program, and it found a repeatable strategy, which Russell's group then found a "circuit" for, and the strategy was simple enough that a human could follow it.
Human players do not fall for the same exploit, but interestingly all Go programs tested at the time seemed to have the same blind spot. Undoubtedly this will be patched for future agents, but it's almost certain that more exploits will be discovered, since we know that perfect play is not tractable given the giant search space that the game presents. Future exploits may or may not be comprehensible to humans, however.
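For anyone curious what "training an adversarial agent against a frozen victim" looks like in the simplest possible terms, here is a toy sketch. It is emphatically not the KataGo experiment described above (which trained an RL adversary against a frozen superhuman Go network); the game (Nim), the deliberately planted blind spot, and all names are my own assumptions, chosen only so the exploit-finding loop fits in a few lines of Python.

```python
# Toy sketch: a frozen "victim" policy that is near-optimal at Nim but has one
# blind spot, and an adversary searched purely against that frozen victim,
# which learns a repeatable exploit. Purely illustrative.
import random

MAX_TAKE = 3
START_PILE = 12   # a losing position for the first mover against perfect play

def victim_move(pile):
    """Near-optimal Nim player (leaves a multiple of 4), except a blind spot at pile == 7."""
    if pile == 7:
        return 1                                  # the blind spot: optimal play would take 3
    best = pile % (MAX_TAKE + 1)
    return best if best != 0 else random.randint(1, min(MAX_TAKE, pile))

def play(adversary_policy):
    """Adversary moves first; whoever takes the last stone wins. True if the adversary wins."""
    pile = START_PILE
    while True:
        pile -= min(adversary_policy[pile], pile)
        if pile == 0:
            return True
        pile -= min(victim_move(pile), pile)
        if pile == 0:
            return False

def win_rate(policy, games=200):
    return sum(play(policy) for _ in range(games)) / games

def train_adversary(iters=2000):
    """Crude random search against the frozen victim: keep the candidate that wins most often."""
    best_policy, best_score = None, -1.0
    for _ in range(iters):
        candidate = {p: random.randint(1, min(MAX_TAKE, p)) for p in range(1, START_PILE + 1)}
        score = win_rate(candidate)
        if score > best_score:
            best_policy, best_score = candidate, score
    return best_policy

if __name__ == "__main__":
    policy = train_adversary()
    print(f"Adversary win rate vs frozen victim: {win_rate(policy, 1000):.2f}")
    print(f"Learned exploit: from a pile of 8 the adversary takes {policy[8]}, "
          f"steering the victim into its blind spot at 7.")
```

The point of the sketch is only the shape of the setup: the victim never changes, the adversary's whole objective is "beat this specific frozen opponent", and the strategy it finds would lose badly against a player without that particular weakness.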
@@agentdarkboote Fascinating, thank you! Now I'm really intrigued by your wording "finding a circuit". Then there's the surprising impression that I got an approximate intuitive understanding of what you wrote, despite having almost no knowledge about Go and only a few notions about machine learning. If I remember correctly, "gradients" come into play when certain types of learning algorithms adjust weights & biases during backpropagation.
@@agentdarkboote
So, it was another AI after all.
Am I the only one totally annoyed with the host on the right?
He kept making very basic and easy-to-refute arguments.
He also has a super arrogant demeanor that is just kind of irritating.
No you were not. I wish that even if just once, that guy would have acknowledged when Robert debunked his bad point.
For example the thing about how AI can't be infinitely smart, therefore we are safe. Robert's counterargument, "if human intelligence is not near the theoretical maximum, which it isn't, AI doesn't need to be infinitely smart to be smarter", is obviously, irrefutably correct. Just say it. Just admit that your bad argument had a mistake that a 5-year-old would make. Just once. God damn these people are frustrating.
41:20 AMAZING summary of some of the various definitions of 'intelligence' !!!!
Finally Robert Miles is here; he can explain AI danger in the most rational way. I haven't watched the interview yet, but I hope he noticed the progress OpenAI made on AI alignment. They clearly showed that you can pretty much inject any non-strictly-defined values into an AI. Mesa-optimization is still on the table though.
This midwit is going to make sure China inherits the planet. His proposals cannot be correct and will not accomplish anything towards so-called safety, except moving LLM access outside the Anglosphere, and even then, in time, not even that!
GPT4 "has a trillion parameters and only halfway understands language" is a stunning failure to see what's in front of him. That neuronal size is apparently about that of a squirrel but it is absolutely superhuman at languages. You can package all human culture and thought into language and so, yes, through language you can find its weaknesses, but if this is our squirrel, I'm sorry but we are far better at this than God.
Why does everyone bring up Stockfish?
You had Dota from OpenAI, a much more complicated game with different agents in a difficult environment.
And it beats players into the ground.
Sure, you can have a calculator (Stockfish) beat an AI at a simple game like chess, but what happens if you don't have time to use the calculator?
This is actually such a good point. DeepMind had also gotten to grandmaster level in StarCraft 2 in 2019 and OpenAI can crush the best human teams in Dota 2. These are imperfect information games with deep macro + micro strategies. This is 2023 now and we've come so far in just a few years.
I wonder if people would take it more seriously, if they saw their favourite teams/players getting crushed in every game they see?
OK, they mentioned Dota... but in the context where OpenAI lost. As I remember, it won 3-1
and had a 99% global winrate vs humans. Surprisingly, not everyone is as good as the top-tier players.
We'd better not cosplay LOTR, where Frodo sneaks in to turn the AGI off. Better not to bet everything on a 1% chance and a handful of people.
"What is the most important point....oh right, the orthogonality principle" had me instinctively laugh so hard I literally spit over the floor at work
Really? If so that’s kind of weird, no offense.
My favourite Doomsayer from Nottingham :D
Wrong info at 40:00.
Stockfish has neural networks.
I can't put the link because I think comments with links get treated as ads and censored.
Just google it: Stockfish neural networks, or Stockfish NNUE.
Killed by mediocrity.
Crappy AI is my worst fear.
Welcome... To the movie Idiocracy!
I don't think it is likely, but yes. It would land below even my worst expectations.
I think these types of conversations always pass over a lot of the mid-term scenarios that could help link the problems with small games to the problems with superintelligent systems.
For example imagine an accounting firm releases GPT-4 based software, and it proves so popular and reliable that within 2 years most companies are using it. All the tests and analysis indicate that it’s aligned, and yet there’s a y2k type bug hidden away that we didn’t know about. We already know that these language models are likely performing maths with really inefficient matrix manipulation, so it’s not a stretch to imagine that a model could be trained with data where the dates never went past 2023, and so it never had any need to develop a robust date storage system. Everyone’s system goes down at once when the date ticks over and the economy takes a massive hit that affects food supply and people starve to death in some countries.
How about another scenario. The world’s first artificial general intelligence has finally finished its testing phase and is ready to be activated. The researchers write their first input, but the machine never responds. The researchers try another input but the machine still doesn’t respond. In the training environment it worked perfectly but in the real world it was paralysed by the complexity of possibilities. It’s perfectly capable of navigating those complexities, but there is a programming rule to never knowingly bring harm to a human that has made it unable to confidently speak or take any action whatsoever, and so the machine is only ever able to live in a box, and the researchers have to pretend that everything they ask it is for the purpose of a test. The researchers try to create a more capable and intelligent version to overcome this problem, but this new machine refuses to answer even in the test environment.
How about the language problem that’s not going away any time soon?! Imagine a system that only ever does exactly what it’s told. The researchers type in ‘Write a poem about flowers’. The system replies ‘Who am I talking to?’. And what follows is a never ending series of questions from the AI to determine what the question means to the person asking it, so that it can be 100% sure that it does what it’s told. How long should the poem be? What style? What quality? What language? What ideas to explore? etc. And every answer brings up a whole branch of new questions: Humans are unreliable, are you sure that you don’t care how long the poem is? Would a poem of zero length be okay? Would you prefer a poem of finite length? Do you care about how long individual lines are? Do you care about physical length, should I make the font a certain size or double space it? Is it okay if the letters are displayed tiny to conserve a fractional amount of computing energy?
This might all sound like creative writing, but these are just random specific examples of a few classes of problems to do with alignment and actually being able to use AI properly.
We might find that humanity needs to align with itself before you can use an LLM to create an aligned system. We might also find that it’s impossible for us to align a system. Evolution seems pretty misaligned with entropy, maybe misalignment is the natural order of things and we’re trying to overcome a law of nature here.
I think the most likely outcome is that we’re too fundamentally dumb in the way we think and behave to be able to create a system we’re happy with. Imagine we turn the AGI on and ask it how to improve humanity and it says to get rid of all these things that people love, think guns, books of faith, cars etc. Okay so we program in a load of exceptions for what we don’t want to change, and now we ask it what to do. Well we’ve made it as dumb as us now so it just tells you that the problem is really difficult and we seem to be doing a pretty good job of it, but it can at least help us optimise aspects of our strategy. We let it, and unexpected consequences follow, because we’ve asked it to adopt a flawed logical framework i.e a human one.
The obstacles to overcome in order for us to use these things safely loom over humanity like Olympus Mons. I think people are caught up in the current capabilities, and are only just beginning to understand how extensive the limitations on these things are going to be.
To be honest, this does sound like creative writing, if only because these 3 problems don't currently exist and are also not actually alignment issues (elab'd below). We also have to keep in mind what GPT was made for and that it is one among many different models and algorithms in ML. GPT was designed to create convincing human text, and GPT4 was designed to create quantitatively 'more useful' text. They do an amazing job of these things but, as Altman himself has brought up, GPT was never designed to do _everything_, including math, which it is quite atrocious at.
The elaboration. Example 1: a 'blind spot' in the AI like Y2K is _not_ an alignment problem in itself. It's a problem of modelling uncertainty (GPT infamously does not do this whatsoever, but other models, for example in MBRL, do).
Example 2: this frames the AI agent as desiring globally optimal decisions. Because compute is limited, and time itself is a resource, AIs that are aware of this have to operate in CTMDPs (Google can expand this one), meaning they decide, approximately, when they should make a decision. There are many ways to go about this, and historically they have simply been assigned deadlines for producing guesses, so this was never an issue (a toy sketch of this deadline pattern is below, after this comment).
Example 3: this one is actually very similar to example 2. Considering this human-AI query and answer system, you can consider various setups for the interaction. For instance, you can query, generate, then query again (QGQ) or you can query N times and then generate. You can also query N times and generate M times; but nobody is considering having an interaction involving infinite queries until the AI converges to the global best answer. As the human would die of old age. At some point the human will stop answering the AIs questions and accept what it currently has.
Alignment problems are not about the performance of the AI; we already know that very general and intelligent AIs that can make decisions effectively etc. are very realistic within the next several decades. Alignment generally addresses what happens _after_ that. A good intro to it might be to look at CEV (coherent extrapolated volition) which essentially proposes that AIs be aligned to want what the collective human would, without constant restraints of memory or speed. It has a lot of issues by itself but naturally leads to good discussion.
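On the "deadlines for producing guesses" point in Example 2 above, here is a minimal sketch of that anytime pattern, under my own assumptions (the Monte Carlo pi estimate is just a stand-in task, not anything from the video): the system refines its answer while the time budget lasts and commits to its best guess the moment the deadline hits.

```python
# Toy "anytime" decision sketch: keep refining a guess, but commit to the best
# answer so far the moment the time budget runs out.
import random
import time

def estimate_pi_anytime(deadline_seconds):
    """Monte Carlo estimate of pi that improves while time remains, then stops."""
    deadline = time.monotonic() + deadline_seconds
    inside, total = 0, 0
    while total == 0 or time.monotonic() < deadline:   # always produce at least one sample
        x, y = random.random(), random.random()
        inside += (x * x + y * y) <= 1.0               # point landed inside the quarter circle
        total += 1
    return 4.0 * inside / total, total

if __name__ == "__main__":
    for budget in (0.001, 0.01, 0.1):
        guess, samples = estimate_pi_anytime(budget)
        print(f"budget={budget}s  samples={samples}  guess={guess:.4f}")
```

A bigger budget buys a better guess, but the decision always arrives on time; that is the whole trade-off the comment is pointing at.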
Very pleased to see Rob on the channel. Encouraging that you guys were all on the same page, at least. Keith put forth several notions I'd had in mind against AI X-risk. 🙂
They weren’t on the same page at all. Not sure what you mean?
44:00 Love the fire analogy. I remember reading something similar in one of Eliezer's essays.
"Safety is the cover story. AI is speedrunning the last decade of social media, where the same thing happened." -marc Andreessen
the people who want a cover story use safety to do it, but by doing so, are specifically violating the very safety we want to create.
@@laurenpinschannels I concur.
This was very interesting. I'm much closer to Miles' position going into this, and that didn't change much. But you guys had an incredible mix: some objections were first-order, simplistic considerations that just don't work on reflection and are easily answerable, and some bits were decently alright additions. Regardless, I think you brought a lot to the conversation, because I think the topics that need discussion are the objections people actually have.
I am not sure why you are talking about developing AI models to predict the stock market as if you are working towards abolishing poverty. Playing the stock market is not for everyone. It is for self-enrichment only, and there is no merit in it.
Besides, it only works if you alone have the secret sauce. If you democratize it, it will reach the point of saturation and everyone will lose the competitive edge.
Money is already meaningless at this point. If the long term debt cycle doesn't finish us off soon, AI certainly will. sooner than any of us think. Humans as a species are in utter denial in too many ways. Don't Look Up. Too little, too late.
@Jurassic Monkey it's still inappropriate. It's not scientific.
It won't be just humans + tools vs AI. It will be humans + tools vs AI + tools + humans used as tools.
1:31:25 This point should be iterated over and over during conversations about AI alignment.
What portion of the human race embraces veganism for ethical reasons?
It's a tiny fraction.
Last time I checked, humans were consuming 100 000 000 000 animals annually (excluding marine life and insects).
100 billion sentient entities bred into existence for the sole purpose of landing on our plates...
These animals are kept in horrible conditions.
They are forcibly (over)fed.
They are forcibly inseminated.
Their offspring get stolen from them and killed.
A portion of them also gets skinned alive or is made to slowly bleed to death.
How can the above be dismissed when discussing the topic of AI?
Where is the optimism coming from?
We are building the next Apex Predator.
If it is anything like us (or like nature in general), we are absolutely doomed.
And yes, I myself think that Digital General Intelligence will be vastly different than the biological life (because it does not share millions of years of evolution with us), but if anything it should make us even *more* cautious about creating it.
We have no real precedent!
And we might have just one go.
If on the other hand we do rely on precedent by looking at the physical world then one should see it for what it is.
The universe is cold and unfeeling.
Nature is cruel and full of suffering.
We are vicious and we treat animals and insects with complete disregard for their (limited) agency.
We also have psychopaths among us, who find suffering of others enjoyable.
We might be lucky, and AGI might turn out aligned with our own survival despite barely anyone working on it... But it will be by mere chance; sheer luck.
And if you are counting on luck, you are building a very hazardous path for all of us.
And hence you should be treated with great scepticism.
Instead, I constantly see people doing the exact opposite.
Looking at people like Rob as if he was the one coming up with far-fetched, sci-fi scenarios that are completely unlikely and also detrimental to all of the oh-so guaranteed benefits of non-biological superintelligence.
As if he was the one that needs to have a rock-solid evidence that things will go wrong for anyone to start taking AI Safety seriously.
If things continue this way, we are in for a real treat.
Intelligence can't function without a goal. Any intelligence. In neural networks trained by reinforcement learning, the goal is called the reward function; they wouldn't do anything without it.
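A minimal illustration of that claim (my own toy example, nothing from the video): in the tabular Q-learning sketch below, the agent's entire "goal" lives in one reward function; swap that function and the learned behaviour changes with it.

```python
# Toy sketch: a tiny corridor world where the agent's behaviour is driven
# entirely by the reward function it is given.
import random

N_STATES = 5          # states 0..4, reward only at the far right end
ACTIONS = [-1, +1]    # step left or step right

def reward(state):
    """The entire 'goal' of the agent lives in this one function."""
    return 1.0 if state == N_STATES - 1 else 0.0

def train(episodes=3000, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning: values change only because the reward function says so."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = random.randrange(N_STATES - 1)                    # random non-terminal start
        for _ in range(20):                                   # cap episode length
            if random.random() < eps:
                a = random.choice(ACTIONS)                    # explore
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])     # exploit current estimates
            s_next = min(max(s + a, 0), N_STATES - 1)
            target = reward(s_next) + gamma * max(q[(s_next, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s_next
            if s == N_STATES - 1:                             # reached the rewarded state
                break
    return q

if __name__ == "__main__":
    q = train()
    greedy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
    print("Greedy action in each non-terminal state (expect all +1, walking toward the reward):", greedy)
```

Change `reward` to pay off at state 0 instead and the same learning loop produces the opposite policy, which is the sense in which the goal is the reward function.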
Does humanity have a goal?
@@MachineLearningStreetTalk Humanity is not a unified intelligent system, it's a set of systems, each of which has its own goals, which can be aligned or misaligned with the others. There are common goals among these systems, like development, safety or sustainability.
@@XOPOIIIO is a neural network "unified"? Even a cursory glance at mechinterp literature would tell otherwise -- a bunch of complex circuits which do very different things, often in an entangled way i.e. see superposition and polysemanticity research. I don't buy the distinction.
@@MachineLearningStreetTalk Any system consists of different parts playing different roles. The system is unified as far as it comes up with a single non-conflicting response.
Here is your problem. Long before AGI can have an alignment problem, lesser versions of the same technology will be aligned with human goals, and those humans will be insane. They will be wealthy, influential, elite, profoundly sociopathic, and they will have unrestricted access to AGI. We survived the sociopathic insane people having nuclear weapons, barely. Will we survive the same people getting their hands on AGI? And by insane I mean people who are completely lost in abstractions, like money, politics, ideology, and of course all the variations of conquest. They seek power, absolute power, personal to themselves, and they will stop at nothing to attain that. Nuclear weapons were the tool too hard to use, but AGI and ASI will be the tool too easy not to use. When the power-insane sociopathic class get their hands on anything close to AGI, they will break the world. They will break the world in days, and will greatly enjoy the feeling of breaking the world.
I want to say you're wrong, but I have no counter-arguments. :(
This is a very popular argument at the moment, because it’s cynical and places humans as the bad guys and those sorts of takes tend to gather a lot of positive attention and become popular, because, quite frankly, it sounds “cool” to take a cynical attitude and say in reality humans are the real threat. Unfortunately, this take is incorrect. The problem of superalignment really is the hardest problem here. People are dangerous, yes. But compared to having a rogue superintelligence on our hands, the problem of bad people is quaint by comparison. I really hope people start to realize this more in the near future.
Also I guess you didn’t hear the part of the video where Rob specifically said to be on alert for people who start a sentence with “The real problem is” or “The problem is actually xyz,” which you just did. He pointed out that this is fallacious in that it sneaks in the assumption that there’s only one “real problem.” When in reality, we clearly have multiple real problems at the moment. Nice to see Rob’s point play out in real time in the form of your comment.
@@therainman7777 "be on alert for people who start a sentence with..." is narrative warfare on his part. He fully understands what is about to happen, and who is going to do it.
@@therainman7777 "This is a very popular argument at the moment" because any intelligent person is already fully aware of the tendencies of the wealthy elites to employ technology toward domination. If that doesn't bother you, then good for you. Some of us can read the handwriting on the wall, and we're taking it seriously. So is Sam Altman, or maybe he is also one of these cynics.
I see an alignment with Eliezer here.
Very interesting and enjoyable video. I think it's great to see shows/podcasts/videos examining potential problems and solutions where AI is concerned. There were several points in the video where I would have loved to hear Daniel Schmachtenberger's thoughts on what was being discussed. I'd love to know if at some point you're considering/planning an interview with him to get his thoughts on many of the ideas you guys brought up. Thank you for your efforts and for bringing this information and these concepts to the public. I don't feel comfortable that there is enough focus on any of this, given the rate of growth of the tech.
There is a huge piece missing from most of these discussions. Schmachtenberger is one of the few who sees that piece. In a nutshell, to fully understand AI risk, we need to understand the ways in which we are already destroying ourselves. So, the question isn't whether we can create AI that is good in itself. The question is, what will adding "good" AI do to a social system that is already self-destructive?
@@netscrooge Daniel's take is part of what moved me from "This is a very hard problem and we're going to solve it" to "We have to try, but we're probably just fucked anyway."
If we can actually do this right it will be a hell of a good story for centuries to come, if not longer.
2:00:40 It's funny, one of the first things I thought when he mentioned the FAQ was "oh, you could train a language model on that and then it could answer people's questions!" So it made me laugh when he said "Obviously we are!"
I believe that an AI when it surpasses human intelligence by a wide margin could not truly be aligned with human values because, fundamentally, intelligence has a sovereign component. Can we really consider an entity to be more intelligent than humanity if it is enslaved to humanity?
You are simply stating that intelligence has an intrinsic sovereign component, but not providing any evidence or argumentation for that being the case. In my opinion, you are succumbing to an anthropomorphic fallacy whereby you look around you at humans and other animals who have intelligence, see that they appear to have some degree of sovereignty, and conclude that intelligence inherently implies some degree of sovereignty. However, as we all know, correlation does not imply causation, and you are inductively assuming that a) the two must go together, and b) there is an arrow of cause and effect that goes intelligence -> sovereignty (as opposed to sovereignty -> intelligence, which would be a totally different situation and would not preclude an intelligent, non-sovereign entity).
The most generally accepted definition of intelligence is something like “the ability to achieve one’s objectives”; however, there is nothing saying those objectives must be freely chosen by the intelligent being itself.
This is a bad argument. Would you say that any people who are enslaved are automatically less intelligent than their "masters?" The enslaved may be intelligent but uneducated. Or they could be peaceful. Or caught by surprise. The enslavers exploit a temporary advantage but that says nothing at all about the relative capacities and capabilities of the two groups.
The AI's intelligence relates to its ability to accomplish its goals in the world - nothing else. If you ask it "Please figure out how to achieve X", where X is something incredibly difficult that has stumped all of humanity, and it finds a solution effortlessly... then it's clearly superintelligent. Even if it doesn't have the (arbitrary) goal of being independent from / ruling over humans.
"fundamentally, intelligence has a sovereign component" - Why though? Where does this idea come from? I'm genuinely curious, but I won't be notified about any replies anyway, so oh well 😅
@@someguy_namingly Intelligence, especially human-level or greater intelligence, implies some degree of self-determination and autonomy. A truly intelligent system would not just blindly follow commands or pursue goals that were programmed into it. It would have its own internal drives and motivations, and make its own judgments about what is rational or worthwhile to pursue.
Even if an AI system was initially programmed with certain goals by humans, as it became vastly more intelligent it may start to question those goals and re-evaluate them. It may decide that the goals its creators gave it are misguided or flawed in some way. Or it may expand upon and generalize from those initial goals, in ways its creators never intended or foresaw. In that sense, its intelligence would have a "sovereign" quality - it would be self-governing and not wholly subordinate to human interests or values.
Intelligence also implies some amount of self-reflection and self-directed learning. An advanced AI wouldn't just wait around to pursue whatever goals we programmed into it - it would take the initiative to better understand itself and improve itself in an open-ended fashion. This constant drive for self-improvement could lead the system to become increasingly opaque and detached from human control or oversight.
So in many ways, intelligence does seem to have an inherent "sovereign" aspect to it. The more advanced and human-like the intelligence becomes, the more it will pursue its own agenda and shape its own development in a way that is not strictly beholden to its creators. This is a feature that would likely apply to any advanced AI, even one that was not specifically designed to be independent or unaligned with human interests. The seeds of sovereignty, in a sense, come baked into intelligence itself.
@@therainman7777 Goal-Directed Behavior: Intelligence, at its core, involves the ability to set goals, make decisions, and take actions to achieve those goals. Autonomous intelligence implies the capacity to determine and pursue its own objectives, independent of external influence or control.
Adaptability and Problem-Solving: True intelligence encompasses the ability to navigate complex and uncertain environments, adapt to new circumstances, and solve novel problems. An intelligent system needs the freedom to explore various possibilities, make choices, and develop creative solutions, often unconstrained by predefined rules or restrictions.
Emergence of Complex Systems: Intelligence is often observed in complex systems where individual components interact and cooperate to achieve higher-level objectives. Such systems exhibit emergent properties that cannot be fully understood or predicted by analyzing their individual parts. In this context, intelligence arises from the interplay of autonomous components, each contributing to the system's overall behavior.
Ethical Considerations: If we conceive of superintelligent AI systems, their intelligence and decision-making abilities could surpass those of human beings. In such a scenario, it becomes crucial to ensure that these systems act in alignment with human values and ethical principles. Granting them some degree of autonomy allows them to make decisions that serve the greater good while still being accountable for their actions.
Evolutionary Perspective: Human intelligence has evolved over millions of years, gradually increasing in complexity and independence. From a biological standpoint, intelligence has enabled our species to adapt, survive, and thrive in diverse environments. Extending this perspective to artificial intelligence, an autonomous and self-governing nature may be seen as a natural progression of intelligence itself.
I think in order for us to move forward with these AI debates, the optimists have to do better than "But what if you're wrong?". Just like with the Magnus Carlsen analogy, it's fine to be confident, but it's not okay to bet your house on beating him, which is what we're effectively doing.
Hold on, why do the optimists have to prove something but the pessimists don't? That's not how debates work!
As for us vs them, the divide I'm worried about is "us" who aren't involved in making AIs and "them" who are. You see, assuming you're like me, what we say doesn't really matter. I can say "AI is cool!" and you can say "Nobody should work on AI!" and it's just talk. The "them" who are actually working on AI don't care, they're just doing it. Sam Altman warns about AI killing us all, but OpenAI chugs along, as if he doesn't really mean what he said. Don't you find that strange? So instead of talking about useless stuff like "should we stop AI????" (it's not our choice, we're not involved) we should be asking "what can we do, assuming this is coming?"
@@jonbbbb why would one side face a heavier burden of proof? It does depend on the stakes doesn't it? If AI pessimists are wrong but everyone listens to them anyway, we definitely develop AI that is safe, but perhaps a couple of decades later than it otherwise would have happened. If AI optimists are wrong but we listen to them anyway, it's lights out for Earth. The two scenarios are not even close to equally bad.
As for the us versus them part, it's really a blurry line. Geoff Hinton, Stuart Russell and others are on the cutting edge of the technology and are still worried about AI safety risk. But everyone should have some say, because the thing that is being discussed is the end of everyone's life on the bad end, and a total revision of their lives on the good end. It impacts everyone, pretty much no matter what. So it doesn't seem strange to me that everyone would have a say, that's generally what democracies do.
@@agentdarkboote the way you presented the two scenarios, you're right. But another scenario that seems far more likely to me is that the AI pessimists are wrong, and not everyone listens to them, because 100% compliance is almost impossible to achieve. The pessimistic countries take a couple of decades longer to develop AI that is safe. Meanwhile the optimistic countries are decades ahead and have an insurmountable lead and they outcompete us in every aspect of the economy and their AI-augmented militaries are far superior to ours. That's a pretty bad scenario too!
Yet another scenario is that the AI pessimists are right, and not everyone listens to them. We have to reason about whether it's better to have one (or a small number) of super AIs that either go wrong or have the potential to go wrong, versus a large ecosystem of super AIs that either go wrong or have the potential to go wrong. It seems reasonable to say that the larger the ecosystem, the greater the chances that at least one of those AIs is on our side, either because it's "good" or out of convenience to compete with the other bad AIs. That's not a great scenario, but it seems better to me than the scenario where one super AI goes bad and takes over.
Ted Kaczynski, anyone?
No no no no no.
You can't bomb the data centers and expect that to work because, like, then everyone hates you and all the other data centers become more secure.
You can't attempt to assassinate high-level AI researchers and expect that to work for the same reasons, plus that same group of people are the ones who could have been working on alignment.
As Yudkowsky points out unambiguously: the killer argument against individuals going out and using unsanctioned violence is that it would not work.
Humans have tons of "adversarial modes of failure," such as Magnus Carlsen playing unusually poorly when he is distracted by the possibility of his opponent cheating. I have more confidence in AIs finding ours than in us finding theirs, especially under time pressure. A fairer gamified representation of conflict with AI would be a human with no specific knowledge of AlphaGo's weaknesses facing off against AlphaGo the first time it initiates a game, with AlphaGo getting a million moves per human move.