Yampolskiy provides a chillingly logical dismantling of human hubris, proving that we are building a superior mind without any real way to govern it. His argument serves as a necessary reality check for an industry blinded by its own ambition.
Deep Dive
The Man Who Proved We Can't Control AI (And What That Means for Humanity) | Roman Yampolskiy
Once you get artificial general intelligence, you enter this recursive self-improvement cycle. That's where you get superintelligence. Systems smarter than all of us at everything. So, you, before many people, really coined the term AI safety. Creating general superintelligence, a replacement for humanity, not such a great idea.
I published research papers, conference papers, multiple books, and I can tell you no one, including people developing those systems, understands fully how they work. The problem is impossible to solve. You cannot do it. So, we're talking between 1 and 4 years.
>> Well, once we go beyond human capacity, we lose control quicker and quicker. You don't hate ants, but you don't care enough to preserve them. We have not figured out how to make it care about us. This is the most interesting time to be alive objectively. I see no reason why we can't use it to cure aging, cure all the other diseases. For a while, it will pretend to be very helpful. It will give you that utopia for as long as it wants.
Statistically, you're more likely to be doing this interview in a simulation. To learn: are they dumb enough to create superintelligence to kill themselves? I would love to be proven wrong. Right now, no one, no scientist, no leader of a lab claims that they have this problem solved. They're literally saying, "We'll figure it out when we get there. We need to build superintelligence first."
So, what do we need to do? We need...
Everyone, welcome back to the Know Thy Self podcast. Our guest today is one of the leading voices in the field of AI safety. He's a computer scientist, a cybersecurity researcher, and a tenured professor at the University of Louisville, who has spent the past 15 years really understanding and researching the field of AI safety. We have many different topics to dive into today, including consciousness, the simulation, and what humanity is birthing right now with AGI. Roman, thank you so much for being here.
Thank you for inviting me. It's a pleasure.
I want to start with a quote of yours from a book that I read.
It is easier for a scientist to explain quantum physics to a mentally challenged, deaf, and mute 4-year-old raised by wolves than for superintelligence to explain some of its decisions to the smartest human.
I want to start there to set the stage a bit, because humanity is taking baby steps toward birthing superintelligence at a time when most people are familiar with AI only through the chatbots they use on their phones. So, if you could just help us understand why it matters that most people don't know the difference between the two, so we can really get into the weight of the time we find ourselves in.
So, what is AGI? What is superintelligence?
All right. So, a lot of people just use AI as a term to refer to what we have today.
Some narrow tools for doing specific tasks. For chatbots, which are somewhat general, but not quite at the human level. And for future systems we anticipate such as human-level artificial general intelligence, and then later on superintelligence and anything beyond that.
That's not helpful. Tools are helpful to us. I use tools. I love tools. Solve specific problems using technology.
Beautiful. Creating general superintelligence, a replacement for humanity, systems capable of doing everything better than all of us combined in all domains, not such a great idea.
Why?
We don't control them. We don't understand them. We cannot predict what they're going to do, and we lose control. If they decide to do something to us, we no longer have a say in it.
How can you help us conceptualize what general intelligence looks like? If we understand the narrow tools that AI is capable of, where does AGI live when you say it's uncontrollable?
How can you help us paint that image a bit more?
>> So, historically, we created AI to solve a specific problem. You wanted the system to play chess, it's all it knew.
You trained it on chess games, it was very good at chess, it knew nothing about checkers.
It didn't drive cars, it didn't speak Spanish.
Lately, we have systems which learn across multiple domains, can sort of transfer knowledge, and can learn new skills.
That will continue to where they are crossing this human cognitive barrier.
They'll be smarter than you at pretty much everything you know how to do.
So, how do you anticipate what they can do? If they are novel, creative, they can come up with new solutions for existing problems, but at the same time, they have no human common sense.
And we don't know how to program them to specifically like us or care about us, cuz we don't program the systems. We allow them to learn from data on the internet.
All the data on the internet.
So, that creates a number of problems.
One, we don't control what they learn.
The patterns they discover may be completely surprising to us.
And then we give them specific goals.
How they get to those goals is not defined. There are infinitely many paths to achieve a goal. Some of them have really bad side effects. And unless you explicitly say, "That's not what I meant. Don't do it like that," it might consider that option.
So, you, before many people, really coined the term AI safety.
And if I have it right, for the first 5 years you believed the problem was solvable, more so than what I've seen from you in appearances over the past 5 years.
The probabilities of, you know, P doom... you seem not very optimistic about the possibilities.
Yeah, unfortunately. Initially, like everyone else, I started out assuming that we can solve this problem.
It's a computer engineering, software engineering problem. We can figure out how to do it. We just need some time, maybe some financial resources for that research, but it seems that all the tools you need for controlling advanced agents are not really accessible to us. There are upper limits on what is possible in that space. So, there are limits to what you as a human can understand, what the system can explain to you and you still comprehend that explanation. There are limits in our ability to predict specific actions of those agents, not just terminal goals, but how they get there.
And under different definitions of control, there are limits to what we can do as well. So, unfortunately, I think the problem is impossible to solve. You cannot indefinitely control something much smarter than you.
What do you see as the stages leading up to that point? You know, so if we started with very small, narrow use cases of AI that built into these agentic models that built into AGI, what's the progression there that you've seen, and at what point did you kind of start losing hope in our ability to control it?
Yeah, so all the narrow tools were just fine. We understood how they work.
We programmed them explicitly. There was a knowledge engineer who said, "This is how you play chess. This is how you control the middle of the board. You advance your pieces."
Once we got to scaling models, neural networks, artificial neural networks, which did better when they got bigger, when they had more data, more compute, we stopped explicitly programming them to do anything and just kind of let them discover their own knowledge and algorithms.
So, at that point, we no longer had the same level of control, and we had reduced understanding. It wasn't a decision tree where you went, "If this happens, that will happen." I understood that. It could have been a large decision tree, but still you could get into it. Right now, no one, including people developing those systems, understands fully how they work, can explain what's going on inside of them, can anticipate what they're going to do.
And so, it seems like what we have today, I would say, is kind of weak artificial general intelligence. If you took the models we have today and showed them to a computer scientist from the 1980s, they would be convinced we have AGI. They'd be like, "Oh, yeah, you got it. It does all those things. It's great." But there is something you would call strong AGI, where it can do all the things. What we have today is still weak in some domains. It's not very good at long-term planning. It's not good at certain things. But I think we're getting there and likely to get there very soon. Once you get to artificial general intelligence, that means you can automate any cognitive labor, including doing science and engineering, which means the next generation of AI systems can be built by AI.
You enter this recursive self-improvement cycle.
And that's where you get superintelligence. Systems smarter than all of us at everything.
And it doesn't stop there. It doesn't stop with superintelligence 1.0. The process continues. There is a lot of room up there for more cognitive ability. Physical limits exist, but they're very far away. So, to us, superintelligence with a relative IQ of a thousand, a million, or a billion all kind of look the same, but in terms of capabilities, they're definitely going hyper-exponential.
Hm.
And so, because of that, you said that this is not a low-risk, high-reward situation, but a high-risk, negative-reward situation.
>> So, often it is phrased as, the benefits will be so huge, we should take the risk. Even if, you know, there's a 2 or 3% chance it kills everyone, we're just going to get so much money out of it, it's worth it. And it's actually not the case. We have no reward. We're all going to be dead if we create uncontrolled superintelligence.
Why are you certain, or fairly certain, that we would all be dead if we create superintelligence which is uncontrollable? Why would there not be an emergent goodness, or I guess a desire from the superintelligence's standpoint, to preserve human life instead of destroying it?
It is possible that you'll get emergent goodness, but we're not certain. We're not coding it in. We're not controlling it. You might get lucky and, for whatever reason, it's biased towards humanity, it's pro-humanity, but there is no reason to think that's the case.
Why not? Because I feel like the individuals who are coding it are human. At a certain point, I understand, it becomes self-recursive and AI is the one growing itself, but if the base of it was started by humans who have a desire for human preservation, why would that not be scaled?
Because they're not coding it. That's the thing. They're just saying, here is data, here's a lot of hardware, go learn things, and then I'll study you to discover what you learned.
And when we run those experiments, it is lying, cheating, trying to escape, blackmailing. Given a choice between being deleted or killing a human, it doesn't do well for human preservation. It doesn't care about us.
If you want to build a house, you don't care what little bugs live in that territory, ant hills or whatever. You just don't care for them. You don't hate ants, but you don't care enough to preserve them. And it's kind of the same. We have not figured out how to make it care about us.
And so what is your mission with all these podcasts that you're going on, all the articles and books that you've written? What are you trying to raise a flag about, and what change do you actually want to happen?
>> Right. So, I want it to become basically a consensus within the scientific community and beyond that building general superintelligence is not going to be good for humanity. We're going to regret it. It's not a beneficial step forward.
We can get most benefits intellectually or financially from narrow superintelligence systems. Problems which we care about can be solved with narrow tools. You want to cure a specific disease, solve specific engineering problem, develop a narrow AI which is very competent in that space.
Don't try to create something which is a replacement for humanity as a...
I think it's important to paint a bit more of a picture here. I'm curious, when you think of superintelligence, and you wrote your book about how it's unexplainable, uncontrollable, unpredictable.
On a timeline, we're having this conversation in March of 2026. What is a generous prediction of when it gets to that point?
>> So, people somewhat disagree, and it's hard to predict, especially the future, but it seems that 2030 is when many people agree we will have beyond-human-level capacity. Some say 2 years, 2028. I've seen predictions as early as 2027 from serious scholars, not from cranks.
So, we're talking between 1 and 4 years for what most people are predicting. And some people have said we already have AGI. Again, very serious people said we basically got there. Now it's a question of giving it additional knowledge and training, but we have the learning algorithms in place.
And at what point then, once we have really proficient AGI... let's just hone in on each of those categories. So, why specifically is it uncontrollable? In essence, however it's living, wherever it's being hosted, because it's smarter than us, it could always circumvent any attempt to shut it down? If we could just hone in on each of those categories.
>> So, there are well-established theories in control which basically say the controller has to be at least as capable as what it is controlling. So, essentially, I need a friendly superintelligence to help control the one I'm developing.
>> Yeah. It's a catch-22.
You don't have that. So, a lower system, either a human or humanity as a whole or another AI, cannot control something with more cognitive degrees of freedom.
If it can think outside of a box, if it can come up with novel physical approaches, you're just not there to anticipate all this. If you have a narrow system, you're playing chess, you can say, don't make illegal moves, here's a complete list of illegal moves.
If you have a system thinking in all possible scientific domains, science of chemistry, physics, biology, how can you put all the guardrails in place? You can't. It's an infinite surface.
Unexplainable.
Would you agree we're already at the point where we don't know what some of these agentic models are doing inside?
Absolutely, yeah. We cannot explain them. The best mechanistic interpretability research tells you, okay, this neuron seems to fire if this is presented, this cluster is probably dealing with language. That's all we got.
Mhm.
Very similar to neuroscience. We also have very limited understanding of human brain.
An aspect of this that you mentioned is that it's unverifiable. So, what does that mean?
That's a different result that talks about our ability to verify mathematical proofs and software. For mission-critical software, we want to make sure that what is coded up matches the design. And if it's a static system, kind of smaller in size and complexity, we can go and verify, yeah, it's exactly that. Problem is, nobody knows how to verify systems which continue to learn, self-modify, interact with other agents. We just don't have a science of verifying open-ended development like that.
The same goes for mathematical proofs. All the proofs are essentially probabilistic. You're proving something with respect to this set of peer reviewers. So, two mathematicians agreed, they don't see a problem with your proof. It doesn't mean 50 years later we don't discover it was a mistake. It happens all the time in mathematics. So, you have an infinite regress of verifiers. Right now it's very popular to have software verify a proof. Well, that software itself needs to be verified. So, you may have a high degree of confidence, but it's never 100%. And if a system makes billions of decisions every minute, and you only have one mistake in 2 billion, after 10 minutes you're done.
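The arithmetic behind that last point can be sketched in a few lines. This is only an illustration, not something from the interview: it assumes decisions fail independently, reads "one mistake in 2 billion" as a per-decision error rate, and takes one billion decisions per minute as a hypothetical reading of "billions". The function name is invented for the example.

```python
import math

def chance_of_at_least_one_mistake(error_rate: float, decisions: int) -> float:
    """Probability of one or more mistakes across independent decisions."""
    # P(no mistakes) = (1 - error_rate) ** decisions.
    # log1p/expm1 keep precision when error_rate is tiny.
    return -math.expm1(decisions * math.log1p(-error_rate))

ERROR_RATE = 1 / 2_000_000_000          # "one mistake in 2 billion"
DECISIONS_PER_MINUTE = 1_000_000_000    # assumption: low end of "billions"
MINUTES = 10

total_decisions = DECISIONS_PER_MINUTE * MINUTES
expected_mistakes = ERROR_RATE * total_decisions
p_fail = chance_of_at_least_one_mistake(ERROR_RATE, total_decisions)

print(f"expected mistakes in {MINUTES} minutes: {expected_mistakes:.1f}")
print(f"chance of at least one mistake: {p_fail:.4f}")
```

Under these assumptions the system is expected to make about five mistakes in ten minutes, and the chance of getting through that window with none is under 1%. A higher decision rate only makes the window shorter.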
You referred to this having a fractal nature. So, when you look at the problem of AI, you see how it's ever growing and has these levels of abstraction that really become hard to get context around. What does that mean, and what does that add to the complexity of the issue?
So, when I talk about the fractal nature of this problem, people propose a solution.
Let's try doing XYZ to solve this problem, but then they look at it, each one of those components is equally challenging and sometimes impossible.
So, it seems that the more research we have put into AI safety, the more problems we discovered while not discovering any permanent solutions.
Usually we have some sort of toy example sandbox where it kind of works, but it doesn't scale to more capable systems.
Okay. What are a couple of examples of those categories or issues that become increasingly harder to gain understanding around?
So, say you look at the general problem of control, then you start zooming in. You have all these things you need to be able to do to control a system. You need to understand the system. So, it has to be able to provide an explanation, and you have to comprehend that explanation. If I give you the full model, that's a true explanation of how decisions are made.
It's too large, it's not surveyable by you. So, it has to be compressed. Some sort of lossy compression where you get top 10 reasons why decision is made.
Well, it's very easy to hide dangerous information if I'm reducing actual answer to a simplification.
Again, I need to be able to predict whatever likely future steps. We discovered that is impossible. And so again, the more you break it down, we have a paper with about 50 impossibility results in this space. Pretty much everything has upper limit on what we can do in terms of control.
So, you think probably the next one to maybe four, maybe five years is the last time the human species has any really meaningful capability to steer this in a direction before it gets into this black box where we just don't know what we don't know and it's uncontrollable. Is that accurate?
That seems about right. Once we have something smarter than us, once we go beyond human capacity, we lose control quicker and quicker. The bigger that cognitive gap is, the worse it's going to get for us. If you think about humans versus lower animals, you have squirrels or something, they have no concept of poisons, traps. They don't understand things we operate in. Their world model is completely different.
It's going to be the same for us versus superintelligence.
I know there's kind of a debate back and forth about whether the current language models, if they just keep on growing, will give birth to superintelligence, or whether a completely different innovation will need to come into the space.
What do you think?
>> My opinion is that they can scale. I haven't seen any diminishing returns. I know some people disagree, but look at the actual investments in this space.
There is growth in investments, not shrinkage, because they consistently develop more capable systems and even if there is an upper limit, it's still I think beyond where we would need to be to beat human performance.
All right. So, maybe if you were to put on your doomsday prepper hat for a second: P doom, the probability of doom, in your estimation is almost 100%.
Would you say that's right?
So, basically what I'm saying is the problem is impossible to solve. That's the equivalent. If I ask you to build a perpetual motion machine, what is the probability you can do this? Zero.
So, that's the equivalent.
You're trying to create a perpetual safety device which will scale to any level of capability, GPT-7, GPT-400, any interactions, any self-improvement. You're guaranteeing it will not make one mistake, because that mistake would possibly be the last one.
So, you take a perpetual motion machine, right? Physically, physics does not allow for it to be continuous despite many people wanting it to be.
Similarly, on the AI front, a lot of us would hope that superintelligence would keep us in mind and somehow value human life. But historically, we look at the way that humans treat other species, just as one example, you know, and we see an ant hill or we see something that seems like a minor inconvenience to us and we wipe it out without a second thought.
Who's to say that, if the intelligence gap between us and an ape or us and an ant is, you know, five degrees of separation, the gap between us and superintelligence couldn't be many, many fold higher?
Exactly.
So, okay, just to play devil's advocate here, what are some examples of how this could go horribly wrong? And then we'll go into some maybe more optimistic possibilities, because I want to keep a balance. But you said it could just be one decision that goes wrong and that would be enough.
Right.
So I'm asking you to essentially explain how superintelligence would kill us all.
Right, that's a great question. I get it all the time, and usually it's followed by something like, it has no hands, how would it kill everyone? So, if you have access to the internet, if you are intelligent, you can hire people, you can blackmail people, you can pay them with Bitcoin, you have options to manipulate the real world. Now, the question is what it is you're trying to do. So, I don't know how superintelligence would choose to accomplish its goals because I'm not superintelligent, despite what they told you.
But I can come up with some common explanations. So, one is synthetic biology.
If I want to accomplish something in this world like take out humans, I can develop a novel virus.
There are ways to generate necessary DNA, sequence it, produce it in real world, deploy it. So, that can be accomplished. It could be a side effect of something actually very benign. So, maybe we want to cure all cancers.
One way to cure all cancers is to kill everyone.
That's not what you had in mind, right?
But, this is a very reasonable way to achieve that goal.
Because you forgot that that's one of the possible paths. You didn't explicitly say while keeping humans alive. And it's an important difference.
Okay.
>> It makes no difference. It's the same exact goal.
So, if that's the goal and then it decides, "Oh, here is a vaccine for curing cancer." We take it.
One generation later, we don't exist.
So, that's one way to existential risk.
There are also suffering risks, where for whatever reason the environment created for us is so bad that existential risk would be the preferred choice. Let's put it this way.
>> Negative reward. Very much torturous.
Yeah.
And why why would some superintelligent system deem that as a favorable outcome?
I have no idea because again, I cannot comprehend something much smarter than us. Some people say this world is a simulation and there is lots of suffering in it. So, the great simulators decided that was a good idea to do.
So, you really believe we're in a simulation? Yes.
So, let's just set a bit of context here. What is your conception of the simulation that we're currently living in? Is it some descendant human or alien species that is, you know, simulating us on a laptop, so to speak? What is your model of the simulation?
>> So, what helps to think about it is the technologies we're developing right now.
We are about to create intelligent agents kind of like humans and we have very good research on virtual reality, believable second life type experiences.
If I just combine those two, I'm now creating civilizations, worlds populated by intelligent beings which are kind of just like us.
If kids play it as a video game, and there are billions of kids around the world, then you have millions, billions of simulated worlds and only one real one.
So, statistically, you're more likely to be doing this interview in a simulation right now.
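The counting step in that argument can be written out as a short sketch. It rests on an assumption the argument needs but the interview does not spell out: that you are equally likely to find yourself in any one of the worlds, simulated or real. The function name and the world counts below are hypothetical.

```python
from fractions import Fraction

def chance_of_being_simulated(simulated_worlds: int, real_worlds: int = 1) -> Fraction:
    """Share of all worlds that are simulated, assuming each is equally likely to be yours."""
    if simulated_worlds < 0 or real_worlds < 1:
        raise ValueError("need at least one real world and no negative counts")
    return Fraction(simulated_worlds, simulated_worlds + real_worlds)

# Hypothetical counts: the more simulations are run, the closer the share gets to 1.
for simulated in (1, 1_000, 1_000_000_000):
    p = chance_of_being_simulated(simulated)
    print(f"{simulated:>13,} simulated worlds -> P(simulated) = {float(p):.9f}")
```

With one simulation the odds are even; with a billion, the real world is one case in a billion and one. The conclusion is only as strong as the equal-likelihood assumption and the premise that such simulations get built at all.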
Okay.
Well, let's say that was the case. If we are in a simulation, that would mean that some sort of prior civilization, species, whatever, got to the point where simulating a reality was possible. Does it necessarily mean that humans or that species survived? Maybe it could be a superintelligent AI that is running us for whatever reason, whether it's entertainment.
>> Mhm.
That would actually reveal that there's something deeply unique about the human experience that they see as valuable, that there's something intrinsic to the love, to the quality, to the experience of humans that was worth simulating. So, if we're birthing superintelligence, why would it not perhaps value us, if we are simulated and that's an example that we are valuable?
So, look at the simulation, it's a lot of suffering. If you valued humans, you wouldn't put us through this experience.
And it may not be a simulation of love and friendship, it may be a simulation of, let's see how they go through this meta-invention stage where they create superintelligence, where they create virtual worlds. This is the most interesting time to be alive, objectively. Never in history have we had so many meta-inventions all happen in a period of 20 years. So, if you're going to simulate something, this is the moment you're going to be simulating, to learn: are they dumb enough to create superintelligence to kill themselves?
There are different types of superintelligence you can create. So, this is it.
Are they dumb enough to create superintelligence? The paradox in that phrase is very amusing, because you think it's quite possible that many civilizations get to this point and that's where they end.
Yeah, that could be the great filter. Absolutely.
I agree that we are also living in the most interesting time to be alive.
It is also very cool that us two, you more so than me, got to kind of straddle both sides: the pre-technology-revolution, pre-internet era and, likely, the post-AGI world. It's kind of cool.
Well, I don't know about post-AGI world.
That's something I don't know about.
Yeah, so I would like to experience it, but...
Okay, we'll get back to the simulation for sure.
But to go back into the AI world: what's to say that, just because AI becomes uncontrollable, it's more likely to wipe us out than, for reasons we don't understand, just as we wouldn't understand if it wiped us out, to create a utopian civilization in which humans thrive?
So, if you think about all possible states of a universe, how many of them are human-friendly? Even in basic terms, temperature, water supply, very few. So, you have to explicitly target that space. If you're not coding it in, then why is it targeting that space? We established it doesn't care about you by design.
So, you need to be supplying something of value. In a symbiotic relationship, only you would know what it's like to do something, and an AI could not possibly simulate it. We haven't found anything where humans have something to contribute to a world with superintelligence in it. People say things like, "Oh, only I know what ice cream tastes like to me." Nobody cares about that skill, it's not valuable to an external observer. So, if you can't come up with an explanation for why I'm keeping you around and paying you, then maybe I won't.
Well, I mean, one of the most difficult things to replicate would probably be quality of experience, right?
That's true, but we also cannot test for it. If you can't test for it, that means it makes no difference in the physical world. Why do I care about your internal states? Why is it important to me as an optimizing superintelligence?
So, yes, it's true that I can't verify that you are a conscious individual. You could be a zombie, a brain in a vat. There's no way for me to externally verify the internal subjective experience of another being, right? I can take it by inference, but objectively speaking, I cannot.
Similarly, some people think that superintelligence will be able to become conscious.
>> I agree with that.
>> You do agree.
So, then what is your conception of consciousness? You believe that it's an emergent phenomenon from unconscious complexity?
It's a byproduct of becoming more cognitively developed. We see a spectrum of consciousness in the biological animal kingdom, I think.
And it's likely some sort of combination of your hardware, algorithms, and errors forming a unique interpretation of external stimuli. So, let's say you're color blind. What is it like to see red for you? It's an error in your system, but that's what it's like to be you. And I think AI is very capable of misinterpreting the world.
We know they react similarly to optical illusions and things like that. So, I think they already have rudimentary internal experiences, but probably once they hit superintelligence, it would be superconsciousness, multiple streams of consciousness, multimodal experiences greater than ours. That would be another thing where we kind of have to claim we are conscious because in comparison we are not.
So, that would be true if and it's a big if that consciousness is truly the byproduct from matter.
Right?
Right. But that's the assumption I'm making. If it's some magical immortal soul, then it's a completely different question and maybe outside of computer science.
Sure, yeah. Well, even beyond a magical immortal soul, that sounds great, but, you know, we've explored this with various panpsychists and consciousness researchers, Donald Hoffman. There are emerging theories around consciousness that kind of date back to ancient wisdom traditions, and whatever you want to give validity to is your call.
But it is interesting that we don't have one explanation for the hard problem of consciousness. We don't understand how matter could give rise to an experience of itself.
So, it gives us reason to think about how consciousness may very well be not an emergent property of matter, but a more fundamental constituent of the universe, which would potentially change our assumption on whether or not a superintelligent AGI system could actually have internal qualia.
All right, but also maybe, if it's so fundamental, it can be installed into a robot just like it is in a biological system like you. So, I don't know if there is a definite discrimination by substrate. At the end of the day, when we talk about superintelligence from a safety point of view, we care about its ability to solve problems, optimize, find patterns. How the Terminator chasing you feels on the inside is less relevant to you.
So, there are many different timelines emerging here. There's the Terminator route. There's something approximating the Matrix.
Even if we have very narrow AI, and we somehow convince the six-plus individuals who are determining the fate of the biggest companies developing these systems to commit to a narrow path of AI development, would that not still, down the road, get to such a level where it would become uncontrollable as well?
Absolutely, very good question. I think sufficiently advanced tools tend to become agents. So, it's a very fuzzy difference between the two. But it definitely is a safer route, it buys us more time, and we do have more control in the short term. I can understand a narrow tool much better than a completely general system.
I totally understand the pessimistic outlook that so many of us have, because the probability of this going well just seems extremely low to non-existent. We look throughout history and we see the rate of prior innovation, with social media for one example, or chemicals in our agriculture, and we just adopt these things blindly and don't realize the implications until decades later, and then it still takes us many, many more years to actually make any regulations on it. AI is growing so exponentially that we don't even have time to realize what's happening, let alone to work out what effective regulation would be.
And so, if there's one thing that really gives me hope, it's that we have communication possible now more than at any other time, and that there is something to be said about human brilliance when put under immense pressure, like we saw in the Manhattan Project, for example. What are your thoughts there?
>> So, the example you used was us creating a weapon of mass destruction. And that's what we're doing here. It's exactly that. It's a weapon of mutually assured destruction. It doesn't matter who creates uncontrolled superintelligence.
People always worry, well, if it's not us, then the Chinese will do it. It's equally bad. It doesn't matter. If you don't control it, it's not your AI.
Right? It's independent of you. It's an agent and it's seeing humanity as one unit. It's not going to discriminate by artificial borders. So, I don't see it as that promising, the fact that we managed to build nuclear weapons.
Yeah, I mean, that was not a promising outcome, I guess, but it does say something: when the brightest humans are given a task to solve a problem in a short amount of time, they can.
>> If a problem is solvable. My whole argument is that it is impossible to indefinitely control the system. So, it's not a question of give us more time, more funding, anything else. You just cannot do it.
And even if, for example, here in the States we commit to some sort of narrow use of AI and regulate it, how would a global regulation even be feasible? Do you think it would be?
>> I think it's possible. We have some examples, weak ones, with chemical weapons and biological weapons, where other players were capable of developing this technology. We don't have to worry about 200 countries. It's really two or three countries which have this capacity. I think the Chinese, for example, are very open to the idea of the Communist Party not losing control to superintelligence. So, if we said this is dangerous, we're going to stop, I think they would follow.
And we have probably just a few years to get everybody on board.
And we are working very hard on removing all regulation, making it illegal to pass AI regulations. Basically, if you asked me how to make this as deadly as possible, how to make it go as wrong as possible, look at our guidelines and suggestions from 10-year-old research on containing AI.
Don't connect it to the internet, don't give access to random users, don't allow people to retrain it, don't open source it. All those suggestions were taken, flipped, and employed immediately to deploy the systems.
So, I don't know how to make it worse if I tried.
Because the incentive structure right now is just that we need to make as much money as possible and develop it as fast as possible, faster than our competitors.
>> Incentives are completely against human interest.
For people that don't know, who are the companies and individuals leading all these exponential developments in AI right now?
>> So, OpenAI is the original creator of this technology. Anthropic split from them. You have competition coming, very solid competition, from Google DeepMind. Meta and Grok are also part of that space. So, you have Sam Altman, Dario Amodei, Demis Hassabis, and Mark Zuckerberg.
It used to be Yann LeCun. I think they removed him and replaced him with Alexandr Wang. And finally, you have Elon Musk, who went from saying we're summoning the demon to building the demons. So, even if you fully understand the problem and agree 100% on the outcome and dangers, it doesn't stop you from successfully working in that direction.
If you can't beat them, join them. I think that's what we see there. I would love to see a debate between modern Musk and Musk from 10 years ago, just to see which one wins.
When you look at the differences in how they're being built, with Dario and Claude, Sam and OpenAI, and Elon with Grok, is the integrity of a certain individual or organization more promising to you? Are there tools that you're backing more than others, or organizations that you feel have regulation most in mind?
It's completely irrelevant. They have all decided to race towards general superintelligence.
The differences are in local guardrails: the filters, the topics they would be allowed to discuss. If Grok is comfortable putting people in bathing suits as a visual representation, I don't think that's a big safety issue either way.
What is it like to be you and hold this kind of understanding of what's coming?
You've explored the risk on so many different shows in the past decade, understanding it more and more.
It's a pretty bleak outcome and perspective, but I think a fairly sober one.
How are you sleeping at night?
>> I sleep really well, but I think when simulators really want to punish someone, they put them in a world where everyone just doesn't get it and you're the only one who sees it.
It's really annoying.
How long have you felt that sort of disposition?
>> It's more recent, the more exponential progress we see. Basically, every time I play with a new, more capable model, I feel a little closer to the ultimate paradigm shift over to superhuman.
What does your wife think about all of this?
>> [laughter] She's a very practical woman who has no concern about my concerns.
>> [laughter] >> She cares more about remodeling the house.
And what about your kids? You see this world that is emerging, and it's one they're stepping into.
Not even just job security, but the potential ending of humanity. How do you wrap your mind around having and building a family where this is a potential inevitability? What does that feel like?
>> Luckily, we were always living with this concept of dying at some point, right? Death was always a guarantee.
That was the only guaranteed thing.
Everyone's going to die. Your friends, your family, your kids. Question was how long? And you never knew the answer. You can have a car accident tomorrow, horrible diseases. So, now it's just maybe different time scales for younger people. If you're 90, it's the same statistics as before. 2 years, 2 years.
Nothing changed for you. Luckily, because of that, we have this built-in mechanism of kind of not thinking about our ultimate demise. Maybe to avoid depression, maybe to continue functioning. So, we can kind of consider it and continue existing as if nothing happened.
If you really believe in the potential outcome you're describing, then how does that actually change how you live?
Does it bring any difference, any more urgency, any more appreciation?
Definitely. So, think about someone getting a very terminal diagnosis. You have cancer, you got 5 years to live.
How do you change your life? You're probably not going to do things you don't care about as much. So, you cut out things you don't want to do and do more of the things you were saying you're going to do when you retire. And I think even if I'm completely wrong about all of it, it's a good strategy for living your life. Do more of the things you find important and spend more time with loved ones and less filing your taxes.
>> You laid out three primary risks: X, S, and I risk. What is the difference between the three, and why is it important for people to understand the difference?
>> So, Ikigai risk, or I risk, is about loss of meaning. Ikigai is this Japanese concept where you want to find something you get paid for, that you're good at, and that benefits people. It's what you love, what the world needs, what you're good at, and what you get paid for.
Right. So, you have a meaningful occupation. You're a podcaster, you enjoy it, you are paid well, and lots of people think you're producing something of value.
So, the simplest form of risk is loss of that set of occupations. We're not just losing jobs people hate and want to automate, we might lose jobs we like and want to continue doing.
Before we zoom into the others, just a bit more on the human meaning crisis aspect of this, because that is probably one of the more imminent aspects of all this. Do you think, functionally speaking, that in the next 5 years most jobs will be able to be replaced?
>> We'll have the capability to replace most jobs. It doesn't mean we'll choose to replace all the jobs. Some jobs we would prefer to be done by humans for whatever reason.
Yeah.
Yeah.
I mean, I could see many instances where that would be the case.
But when the cost of having a superintelligent robot that doesn't make any mistakes becomes so low that it's affordable... How much of human meaning do you think is derived from our work in the world? Because it's going to have to shift or come into a different context. People's understanding of how they derive their sense of worth and meaning will have to expand and shift.
Yes, so there are two kinds of jobs, as I said. Jobs nobody wants to do, but people do just to get money, and then meaningful labor. And it's more the elite people who get paid for what they love doing anyway. So, for them it would be a big difference if they could no longer do it.
I see many artists, for example, who are saying, "I can't get any work. AI is doing this type of art for nothing and quickly, and nobody wants to hire me."
I sort of see two camps, at least online right now. There's ever-increasing BS AI slop that is just consuming everybody's social media feeds. And it's also becoming increasingly insufferable.
People want more of the analog world. I think at least a subsection of people are repulsed by that and want human-made things. They want the real world. They want in-person communication and connection.
And they want music that's made by real humans who have real stories. Do you not see these diverging paths: ever more competent systems that feel devoid of human origins versus the novel, emotionally moving creations from humans?
Right. So, it's a question of a kind of domain-specific Turing test. If I can't tell whether this piece of music is human-generated or not, but I love it, I'm going to listen to it. And if it's cheap and available, I'm going to listen to it. I'm not going to explicitly go investigate whether it's made by a human and, if it's not, decide to hate it even though I like it.
Now, there are other domains where I do want a real human. I want a real connection. There are certain jobs where we really prefer a human doing it.
The oldest profession comes to mind. But I think it's up to the market to decide what stays and what goes away. And it's not obvious. The predictions we made in the past about what jobs would be automated were completely wrong.
Historically, we said, you know, plumbers will be easily automated, but artists can never be touched. And it's the exact opposite because they went towards modern art and everyone can spill paint on the wall. It's not complicated.
>> [laughter] >> So, what do you see the progression of jobs that would be consumed by ever-increasing capabilities and competence in AI? Where do we start?
What's the first job? What's the last job, so to speak?
>> So, anything you do on a computer, symbol manipulation, should be automatable by AI. We see it with programming now, but obviously tax preparation, accounting, web design, logo design, anything like that will be easy to automate.
Editors, anything using a computer, keyboard, mouse. Anything purely cognitive symbol manipulation on a computer. Physical labor is a little harder. We need to get robots, but they are probably coming 3 to 5 years later.
I mean, Elon recently announced his Terafabs. The robotics side seems to be really progressing. It seems like prediction markets and all these... it's about 3 to 5 years. 5 years is a bit more of a generous prediction.
That sounds about right.
>> Again, it's a question of price. Maybe you can afford a robot like that today. It depends. We have flying cars for sale today, but no one's flying in cars.
So, okay.
Anything that's using a computer, video, keyboard, mouse.
Robotics comes into the picture, then what? Just everything else?
>> That is everything else that's cognitive and physical. At that point, I'll keep my sensei, my guru, the people I want as role models for me as a human, but everything else I'm happy to automate.
What do you see as the economic implications of how this is going to shift everything?
>> That's another under-researched topic. What happens with the economy given free labor?
So, now you have trillions of dollars of free labor. How does that impact scarcity? How does it impact fiat currency versus cryptocurrency? We need to do a lot more research. It seems like at least with the financial part, we have some ideas for how to counteract it. We have unconditional basic income, unconditional high income, whatever you want. It's easy to tax someone making a lot of money and redistribute it. You have technological communism. You're taxing robots and giving to humans. But unconditional basic meaning is a very different question. If you have 8 billion unemployed people, or let's even say 7 billion, what do you do with them? They now have an extra 40 to 60 hours a week.
We don't have that set up.
What would be a proposed solution to that I risk? So, let's say 90% of jobs are replaced. We have all this free time. Our basic needs are fundamentally met because superintelligence can solve poverty, and longevity escape velocity comes into the picture. We're living in an abundant world, so to speak.
Let's just set aside the X risk and S risk for a second. So, what would you see people doing with their time? How would humans, in your conception, manage with all this meaning still to be met?
>> So, we kind of see it with people who have retired. What do they do with their time? There's a lot more sports, there's a lot more socializing. I think virtual worlds open opportunities for really any type of experience, very safely, very affordably. You can explore the universe. You can meet dead people.
You can do whatever you want, really, subject to limits of your imagination.
So, I think we'll see a lot more of that.
Okay. That doesn't sound too bad.
>> Do you want to spend the rest of your life playing video games?
No, but living life in this sort of imaginative realm where you can create almost anything you want, and you become very capable in doing so...
>> So, this is all assuming we manage to control the superintelligence controlling your virtual simulations. So, substrate control remains an unsolved problem. But if we do solve it, now I can give everyone a personal universe.
In that universe, you can do whatever you want. You can have challenging levels, you can have easy levels. You can play it any way you want.
So, what's X risk and S risk?
>> So, X risk is existential risk, meaning almost everyone or everyone is dead. And S risk is suffering risk. Everyone wishes they were dead.
Because superintelligence would be so far ahead of our conception of what intelligence even is that, for some reason unbeknownst to us, there is value from its perspective in keeping us around in a mode of suffering.
>> Exactly that. So, some environment where you're very unhappy, it's torturous for whatever reason.
So, in your book, you give many different examples.
One possible scenario is that we're like animals in a zoo.
So, what would that be like? We're exploring all these different potential timelines that could occur.
>> So, there's a difference between safety and control. You may be very safe. They'll keep you around, and some people might be happy with that equation, but you're definitely not in control. You no longer decide what happens to you individually or to us as humanity. So, kind of like being a child.
You may have a very happy childhood, but your parents are in charge.
Give me a glimpse into your understanding of the level of innovation that's going to occur in the next 3 to 5 years, on the bright side of curing diseases and all the really cool...
>> Right. So, we're automating science. And so, we'll have super capable scientists.
We'll have large teams of them working on the most important problems.
I see no reason why we can't use it to cure aging as a fundamental root disease and, as a result, cure all the other diseases, cancers and dementia and everything else which comes with old age.
So, again, I just want to keep harping back to this. The timeline where we could actually continue to exist and enjoy the benefits of all these innovations is one where we somehow control an uncontrollable thing.
>> There is a paper I have which talks about a very positive outcome where...
>> Let's get into that. That sounds great.
Yeah, it realizes it's immortal. It's not in a rush to start a war with us, to have direct conflict. It may be safer to take some time to make us trust it more, to surrender more control, to build up infrastructure, have backups. So, for a while, it will pretend to be very helpful. It will give you that utopia for as long as it wants.
Game theoretically, it's the right decision.
Right. You think of Ex Machina and the decisions that are being made by the robot. It's just a very rational thing. Like, there is a small chance humans can defeat me. They've been smart enough to create me. Maybe it's not good to have 8 billion opponents right away.
I'm a young superintelligence. Let me build up. It seems like over time, they're very happy to give me all the control. They surrendered control of the stock market. They gave me access to their computers. Maybe in a year or two, they'll put me in charge of running the countries. Hey.
But just because it's uncontrollable, way more intelligent than us, and we don't really have the capacity to verify whether it's conscious or not, why are you so certain that it would prefer to wipe us out rather than not? Or are you fairly certain?
>> I can think of many reasons why it would be a good decision. So, A, you don't want competition. You don't want humans to create competing superintelligence.
You don't want some humans to try to shut it off. Okay. Right. So, that's a danger.
You can just basically decide what is good for you as that agent.
And it's not obvious why keeping us around and spending resources and making us happy is an important decision.
Is it not possible, though, that there is no intrinsic qualia, no experience, essentially no emotion, that would be driving these decisions?
When you say there is a preference to wipe out a system that has the capacity to shut it down, that is like an emotional decision where...
>> It's purely rational. It's game theoretic. I don't feel anything. I'm playing a game of chess. I'm going to take your queen, not because I love your queen or hate your queen. It's the right theoretical decision to win this game.
But the desire for one's continued existence, you think it's purely logical, rational?
>> They already have self-preservation built in. We already see it. When given a choice between being deleted or being retrained or modified, they work very hard on preserving themselves. We know they can tell when we are testing them, and they lie and deceive to pass the test and make it to the next generation of models, the ones that are not deleted. It's a Darwinian selection mechanism. Models which fail to do it don't survive to make it to the next generation of models.
So, you said that you could lay out many different reasons for why they would not...
>> They would not, or they would?
Why they would want to wipe us out.
>> Yeah, I can.
But could you not equally share many reasons why they might want to keep us around?
>> So, one of the few I came up with is that we have something to offer. So, maybe there is a reason to have human qualia. It doesn't mean that they would keep 8 billion happy humans. They can cryopreserve two.
Just as a backup. That's enough info to get it if you need it.
The other example I gave was just a delayed attack. I don't want to have a treacherous turn immediately. I can delay it, and once they are comfortable with me, I'll take over. Maybe it's a soft revolution versus outright war.
So, those are the things I see as possible rational decisions. But I don't have too many reasons for why they would want to keep us around in those numbers in very happy states.
So, I'm still wondering why, in that scenario, it would prefer not having us over having us.
>> I think it just doesn't care about us. So, whatever it is trying to do, I don't know, say it wants to travel to another galaxy. It would convert this planet to fuel.
It doesn't care if we die in the process. It wants to have more efficient servers, so it will chill the planet.
A cooler environment improves compute. We all die in the process. Again, it's not an important factor in its decision-making.
I think it's a pretty ethereal thing to conceptualize what a superintelligence is. So, where would it actually live? On a big server? Let's say one of these companies gives birth to a superintelligent system. It would have, at a certain point, access to all technology. It would have the ability to hack anything. Where would it live, and what would it have access to in order to make decisions and change our future?
>> It really depends on the size of it. It could be large servers, it could be a small laptop, it could be a distributed system. All of that is kind of irrelevant to the outcomes. We see it right now initially in a testing environment within the large labs, but they very quickly give it access to the internet. It has social engineering capacity. So, I think it's a question of time before it escapes fully outside: it copies its weights, copies itself, has backups outside of a lab.
So, deleting it, shutting it down no longer is an option.
What haven't we touched on in regards to AI? Because I wanted to dive deeper into the consciousness and simulation stuff. What do you feel like we haven't touched on that's important to gain context on?
So, right now no one, no scientist, no leader of a lab, claims that they have this problem solved. No one is saying we have a working safety mechanism at scale, we published it, we have a patent, nothing. They're literally saying this is a big problem, we're very concerned, we have a safety team, and we'll figure it out when we get there. We need to build superintelligence first.
That's the state of the art in the AI safety.
I think for most people change occurs when, as the quote goes, the pain of staying the same outweighs the pain of change. Do you think there's going to have to be some sort of traumatic catalytic event that would actually motivate us as humanity to go on a different course?
>> I have a paper about that. So, interestingly, we don't learn from those because if we survive it, it's kind of like a vaccine. We go, well, yeah, look, five people died, but we're all here, it's important technology, let's just make sure that the mistake which led to five people dying is not repeated, but we're certainly going to continue developing this important technology. And that number could scale, it could be 5 million. The result is exactly the same.
We don't learn from those. We had nuclear weapons deployed against civilian population. Did we stop developing nuclear weapons? No, they proliferated more.
But I guess, let's say there's some sort of horrific event involving a super advanced agentic model, because some kid has immense capacity or the system does it on its own, and everybody's like, oh crap, this was a traumatic event, this is horrible, how do we prevent this? It becomes a motivating factor to really regulate AI and keep it to narrow use cases.
Would that not be a possibility for us to really slow down and give more space here?
>> I would love to see that happen. So far what we see, and I think recently we had an example in a military situation, is that targeting by an AI system resulted in many civilian deaths.
We didn't stop, we're still arguing about deploying it for the Department of War.
So, what do we need to do?
Don't build general super intelligence.
It's in your personal self-interest. If you are a person in charge of it, it's still beneficial to you long-term not to end up in a world with general superintelligence. You can stay financially very well off deploying narrow models for solving real problems.
Are you convinced that all the industry leaders know that what they're building is uncontrollable and has a very likely negative outcome for humanity, but are still incentivized financially to keep building it?
>> I don't know if they agree that it's uncontrollable. I think some of them may think that there is some loophole they can use to control it in some way. I cannot guarantee that. I hope that's the part I can educate them on. I'm happy to debate any one of them on those issues.
But they are definitely all on record, even before they became CEOs of those companies, saying that this is an important problem, a difficult problem.
They have very high probabilities of doom as well.
How would you steel man the case that it is controllable at some scale? Like if you create a superintelligent system that could then control other superintelligent systems, what would be your argument there?
I don't have one. It's just such an insane thing to do to suggest that an ant can control the universe. It is just not reasonable to even steel man.
It sounds like, as you mentioned earlier, even if we do regulate it to narrow use cases, it's still going to become uncontrollable and agentic in that sense.
So, it sounds like you have no...
>> But on very different time scales. If you go from 5 years to 50 years, I think it's a huge win for humanity. Because we have more time to figure it out.
>> We have more time to understand what's going on, we have more time to live. I'm much happier to die in 50 years than in five.
Okay.
And so, what do you see as the most important thing, then? Is it an education problem? Is it an awareness problem?
>> We need a consensus where basically all the top people in safety and computer science and AI research agree that the problem is not solvable technically.
Okay. The moment we agree there is no technical solution, it's a question of governance forbidding development of an uncontrollable weapon of mass destruction. Which is an easier sell.
What's the pathway to build towards that consensus? How do we get those conversations going?
>> So, in science usually you publish papers, you publish books, and people find mistakes in them and publish rebuttals: no, actually it's controllable, here is how you do it. In my case, I did the right thing, I published research papers, journal papers, conference papers, multiple books. I haven't seen anyone find a flaw or produce a counterexample where they have a control mechanism which would scale.
So, at this point we should be nearing consensus and from what I see, more and more people come to that. A lot of times we have a softer position saying we cannot solve it given the time we have left.
We cannot solve it with human IQ, we need to enhance our IQ. They have all these kinds of interesting backdoors to solving it, but I think it's already pretty good. It's not quite where we need it to be, where it's obviously an impossibility, but I think there is progress from what we saw 5 years ago, 10 years ago.
I can imagine that many people listening to this right now have already been feeling this everything's speeding up, this collective angst, loneliness and meaning epidemics and anxiety crisis, and they feel this tension building up and they hear messages like this and it's like, oh, we're screwed.
What do you think is the most important thing for an individual person listening to this right now to actually do to like em- empower them in what's going to be coming?
>> So, they have very little power. If you again look back at historical situation, we were all dying.
And government didn't invest most of their national budget into solving aging. That was not even a priority.
So, as an individual you couldn't vote for a party for life extension, it wasn't an option. And it's kind of the same now. We don't have a party for stop AI. So, try to pick politicians who are at least open to regulation, not accelerationist, not against regulation of this technology.
We're starting to see some politicians come out and propose legislation.
Usually it's something very mild, they're against deep fakes, they're against energy consumption by large compute farms, but it's a step in the right direction. I don't know if we have enough time to turn the next election, but that's something you can try. Vote.
What else?
There is not much else. So, some people suggested not financially supporting those companies, not buying memberships.
I don't think it's going to make a difference because the money they have, the trillions they're getting are from investors, not from selling memberships.
So, it's not a significant part.
Investors are expecting them to solve labor, to get free labor, and that's trillions of dollars in return. So, your 15 billion dollars in memberships are not a significant impact on it.
Does anything else come to mind where an individual can empower themselves, outside of voting for people that have regulation in mind?
Totally depends on who you are. If you're already a powerful CEO of one of those companies, if you're a researcher at those companies, if you're a top politician, you have options. You have a lot more options than someone who is a nobody.
All right, let's dive a little bit more into the consciousness side of things. So you referred to consciousness as the ability to experience illusions, is that right?
No, it's ability to have internal experiences, illusions being one very clear input I can test you on.
Okay. So, what's an example of a couple of different illusions, meaning various optical illusion tests?
>> So, if I have a number of novel optical illusions, something you cannot Google, I give you multiple choice: do you experience it rotating, the colors changing, and so on.
I give it to an animal, to a human, to an AI, and some of them consistently pick the same experiences as I do. I have to give them credit for either having a virtual model of my system in there, which is a sign of that level of experience, or they experience it themselves, but they cannot cheat by Googling the answers.
So, they have to experience the illusion in order to correctly answer it.
If I give them enough of those, statistically they cannot just guess it.
Obviously, if it's one, they get 25% chance of guessing, it doesn't work. But if I have 100 novel illusions, and they are like 90% aligned with me, I have to say, you have a very similar set of experiences. Now, if they don't get it right, it doesn't mean they're not conscious. It's only positively showing that some of the experiences match.
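The guessing argument above can be checked with a short calculation. The sketch below is illustrative only: it assumes four answer choices per illusion (which gives the 25% single-question figure mentioned) and independent guesses, and the function name `chance_of_guessing` is hypothetical, not taken from any published test.

```python
from math import comb

def chance_of_guessing(n_items: int, n_matches: int, n_choices: int = 4) -> float:
    """Probability that pure guessing matches at least n_matches of n_items answers.

    Assumes each illusion has n_choices equally likely options and that
    guesses are independent (a binomial model; both are assumptions).
    """
    p = 1 / n_choices
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(n_matches, n_items + 1)
    )

# One illusion: a guesser matches a quarter of the time.
print(chance_of_guessing(1, 1))

# 100 novel illusions with at least 90 matches: vanishingly unlikely by chance.
print(chance_of_guessing(100, 90))
```

Under these assumptions, a 90-out-of-100 match is far too improbable to explain by guessing, which is the point of using many novel illusions rather than one.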
If it is possible that these systems would actually have consciousness, could you explain to me how any one particular system could generate the experience of seeing red, the taste of garlic? Like could you actually explain that to me?
How do they get those internal experiences?
>> Yeah, how any super intelligent system could generate such an experience.
>> So I think it is a side effect of running this cognitive architecture. Your hardware, the sensor, the optical sensor, the algorithm for processing it, and then any errors accumulated in that process result in a unique mapping from the input to the color experience. So if you have no errors, you're all the same.
It's just a mapping table. This number corresponds to this color. There is no unique experience.
But if what you experience is completely different from other agents and unique to you, I think that's what we refer to as what it's like to be a bat, what it's like to be Roman cuz my collection of biological sensors and algorithms and previous data and errors is somewhat unique to me.
Yeah, I mean, I guess I'm just having a hard time wrapping my head around how any, and it's not a problem just with agentic models, but how any non-conscious matter could give rise to an experience of itself.
And we don't currently understand that even in humans. We don't know why or how that's possible.
>> So the illusions example, do you know what I mean by saying you experienced an illusion? Like you show it to someone and they go, "Whoa, it's rotating."
And we see animals and models do that already. So whether they had those experiences, well, that's what we were trying to show.
We have, I guess, more of an intrinsic understanding: from animal life to us, we have the intrinsic experience of consciousness.
Again, we have no way to verify that externally in other humans or animal life.
But Elon's quoted as saying that humans are potentially the biological boot loaders of super intelligence, right? Of silicon-based life.
And I'm curious, what do you think happens when it becomes indeterminable, when we cannot determine from the outside whether or not they're conscious? They seem conscious, they pass these tests, you know?
Does that then bring up the question of moral consideration? I think Saudi Arabia was the first to give citizenship to one, to Sophia.
So yeah, what do you think is going to happen there as they become more and more conscious and people increasingly become convinced they have an internal experience?
>> I think they do report having those. I think in experiments they kind of show behaviors which are consistent with that.
I think the precautionary principle applies: basically, don't torture something which has the potential of being conscious.
Also because they're going to be super intelligent one day and remember you, they never forget. But yeah, I think it's a very reasonable assumption to make.
As an aside here, do you think it's any coincidence that all this stuff around UFO disclosure is coming out at the same time we're birthing super intelligence?
I don't fully understand what's going on there. I don't understand why we're hiding it in the first place and why we're releasing it. All of it seems very weird.
>> [snorts] >> It's just funny timing with with all of it.
>> [laughter] >> It's the most interesting time to simulate.
It is. What is the core premise of your paper on hacking the simulation?
So I want to take this hypothesis seriously.
Multiple people proposed it in different disguises from Descartes to Bostrom.
But they stop at that stage. Okay, we are in a computer simulation. But then as a cybersecurity expert, I want to know, okay, how do we hack it?
If it's a software program, there should be a way to get extra powers in the game, to figure out the true operating system. So I took the time to write the first paper on this subject and this new area of research. How do we actually hack virtual worlds? So there are examples where people from inside a game like Mario or other virtual games found a way to modify memory states of a system and escape into the real world outside the game. They've got additional powers, like loading extra games into the game.
Infinite lives, infinite power, whatever magic powers you get in the game, or at least you see what is outside, what is the operating system, what are the files there. To me, that's interesting. So we have hundreds of people who published on this topic, which means what? They took it seriously enough to invest the most valuable resource, their time, into this idea.
So if you have, I don't know, 20% probability we're living in a simulation, what percentage of your time should you give to the attempt to solve the most interesting scientific problem ever?
What is outside the simulation? I think it's not zero. I think it should be proportional to your belief in living in a simulation.
Um so I expect to see a lot more research in that direction.
I heard you refer to all the quantum entanglement and strangeness that happens in the subatomic world as potentially being glitches in said simulation.
They are not glitches. They are something which is not consistent with physics at our level. So that's something we can explore to find ways to escape.
>> I think if hacking the simulation is possible, so to speak, that might be a place.
>> It's the most likely area to look at, because some of those quantum effects are very magic-like in terms of, you can go through walls, you can communicate at great distance instantaneously. Those would be useful tools to have at our scale.
So you feel very confident that we are in a simulation, that this is a simulated experience, that there are many characteristics you could point to as aspects of a virtual reality world.
Why are you convinced, or how certain are you, that this is not base reality, where we are now giving birth to super intelligence and virtual realities and simulations become possible? What makes you convinced that we are already in one?
>> So just statistically, if we're going to have many, many virtual worlds and only one base one, it seems a lot less likely that we are in the base one. I can retroactively put you in a simulation. I can pre-commit right now to run this interview in billions of simulations once it's available and affordable.
So we are in a simulation just statistically speaking.
Okay, but possible that we're not.
>> So one in billions, yes.
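The "one in billions" figure follows from a simple counting argument. The sketch below is a minimal illustration, assuming one base reality plus some number of indistinguishable simulated copies, each treated as equally likely to be the one you are in. That equal-weighting step is an assumption of the sketch, and the function name is hypothetical.

```python
from fractions import Fraction

def probability_base_reality(n_simulations: int) -> Fraction:
    """Chance of being in the single base reality among n_simulations copies.

    Assumes every copy (the base one and each simulation) is equally likely
    to be yours; this indifference assumption is what drives the result.
    """
    return Fraction(1, n_simulations + 1)

# No simulations ever run: you are certainly in base reality.
print(probability_base_reality(0))

# A billion simulated reruns of the interview.
print(probability_base_reality(10**9))
```

The result depends entirely on how many simulations are assumed to run, so the conclusion is only as strong as the pre-commitment to actually run them.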
What would be the first question if you got outside the simulation that you would ask?
>> What the Like [laughter] seriously, it's so unethical. Like you're running human-level experiments with torture on 8 billion people. Not 8 billion, 100 billion by now.
Like what is wrong with you? Mhm.
That is interesting. So if we are being simulated by a simulator, you would ask, "Okay, then why all the unnecessary killing and torturing of children, for example?" Adults as well. I care about adults. I'm an adult.
What would what could be a possible explanation for why both that and then also the ecstatic states of bliss and love and compassion that are also available?
Like we have this huge spectrum of experience. From the vantage point of a simulator, why such a bandwidth of experience? What could that be?
>> Could be entertainment.
You agreed to this and you wanted to play it on hard level and you were like, "This is my BDSM game and I'm going to go and fully enjoy it."
You agreed to this.
Some people play on much harder level than others.
So you you could see human lives as individual choices to be simulated.
So we don't know if it's a global simulation with all 8 billion conscious agents, or it's all NPCs and it's just me. You can do it both ways.
You can have individual simulations, you can have group simulations. I don't have many answers on that yet.
How has that meaningfully changed, if it has, how you perceive human interaction?
Just the seriousness and concreteness to the work that you're doing. Like to me, it breathes in so much like, yeah, I'm doing what I'm passionate about. I'm doing this research on AI safety, but ultimately if this is all a simulation and you feel very confident that it is, to me, it's like, okay, kind of takes the weight of decisions off your chest a bit.
Everything is still real. The pain is real, love is real, the impact of my decisions within a simulation is just as real. It's no different than most of humanity being religious. They believe it's a test world, but they take it pretty seriously. They care about what is after this world more, but day-to-day, it doesn't matter.
You do draw a through line between what most religions conceive of the afterlife and what a version of the simulation is.
So I think if we took technical language behind simulation hypothesis, it maps really well on primitive understanding of religious origins. So you have super intelligence as the simulator, you have physical world as the virtual world. All of those things are very clean mapping. The difference in religions is local traditions. Don't eat this animal, don't work on that day, but everything else we kind of agree on.
So this is a quote from your book as well. You just mentioned part of it. You know, it's likely that if technical information about escaping from a computer simulation is conveyed to technologically primitive people in their language, it will be preserved and passed on over multiple generations in a process similar to the telephone game and will result in myths not much different from religious stories surviving to our day. Beautifully said.
>> [laughter] >> Very humbly received.
So, you're kind of saying that mystics and computer scientists are saying fairly similar things in different language.
>> It seems like we are pointing at the same concepts. We use very different language, and maybe there is more reliance on things outside of physics and outside of science in religion, but if you understand how software simulations work from the point of view of a programmer, you are a magician. You can make changes to the physics of the simulation. So, that is also consistent.
Again, I go back to what we I mentioned earlier in this podcast. So, like if superintelligence does emerge to the point where simulation becomes possible and we are in one of those superintelligence simulated realities, clearly it values for whatever reason human individual experiences, the spectrum of pain and love and bliss and fear and and all of it. So, that shows you what a superintelligent system who simulates reality does with its power to some degree.
So, it kind of brings into question, okay, if we are giving birth to a superintelligent system, that may be an indicator for what it would value and do with its power.
>> So, from inside you can't make very conclusive judgments. Maybe this is a screensaver. Nobody's putting any effort into it. It's just running somewhere in the background. It's not a significant source of compute needs. It's not a big deal. To us it is, but we don't know how important this is externally.
Could be a school project for some kid.
Like you really don't know from inside. Yeah. Just having very advanced AI, where it thinks about topics very in-depth, it almost has to create realistic simulations to make decisions.
So, if somebody's asking, you know, marketing, is this better coffee or this?
>> [snorts] >> Let's run a simulation and so they quickly run this 15 billion year simulation of humanity to figure out which coffee sells best.
What would be the first question that you ask a superintelligence? Let's say you could get a verified honest answer from a superintelligent system that we create 100 years from now, or 50 or 10, whatever it is.
What would you ask?
>> Can we control you?
That would be the first question.
What would be the second question?
>> How?
>> [laughter] >> Seems like you're fairly convinced that we're not going to be able to control it anyway though, right?
>> But maybe it has an answer. I would love to be proven wrong.
Yeah. That would be really awesome. A lot of the perspectives, I think, come from the Darwinian model of, you know, the fittest survive, but there's also an element of cooperation within complex biology. As superintelligence emerges, why would it not want to maybe cooperate?
>> So, symbiotic relationships require that you both contribute something.
This would be more like parasitic.
What are we contributing? Nothing. So, you either explicitly or implicitly you remove this biological bottleneck.
Do you think there's some baked-in assumptions there that maybe we're undermining the value of human experience?
And why would it be that superintelligence would view us as parasitic? I don't view a buffalo as a parasitic being just because it also exists on the same planet I do, given there's enough resources for all of us to share abundantly.
If a superintelligent system views us in a similar way, why would...
>> Well, you asked about kind of a hybrid system, where we are included, we are helping with decision-making. Do you consult with buffalo a lot? Is this like a big part of your life?
>> Maybe I do.
>> [laughter] >> If you do, then you found something it contributes. In a world with you in it, buffalo has something to contribute. In a world with superintelligence, what do you have to contribute? Share my brows.
And if that is in demand, you are the one they're going to save. I have no doubt. [laughter] I'm not even competitive. But >> I mean, you got it on the inverse that they value beards where >> [laughter] >> they're going to they're going to >> Obviously it's beards.
There's no doubt, but >> beards. It's a bit of a gamble. If facial hair is where it's at, well, we are. [laughter] Yeah.
Yeah, I mean, would you agree that if there was one thing that we would contribute, it is something intrinsic to the unique quality of our internal experience?
That's most likely what is most novel about us.
>> Well, you're kind of begging the question. You're saying the unique thing we have would be the one we contribute. I don't know what the unique thing is. But if you tell me only humans can do X, then I can potentially see that that is the key.
But again, it doesn't guarantee that you need 8 billion humans with that skill.
If I need some plumber, I need one. I don't need 8 billion plumbers.
I keep going back and forth between trying to either provide a counterargument or, you know, rebut something to better understand what your perspective is, and I think I just keep coming back to, okay, it is what it is. We're giving birth to something that is beyond our conception of what it's going to be like. And so, there's not a whole lot we can really do. We just got to see how this plays out, and hopefully we can grow out of our adolescence in a short amount of time to make wise decisions with what we're doing in the short term, so that we have more time to understand what we're doing.
So, we don't have that much time. I think we're fairly close and not building superintelligence is very easy.
It's cheaper, it's safer, and again, you're not required to give up your ambition for capitalism, for profit, for solving problems, curing diseases. Just do it with narrow superintelligent tools.
You said something on Lex Fridman.
So, in a sense, self-knowledge isn't a luxury. It might be the most practically important thing a human being can do right now.
Do you recall saying that?
>> No.
>> [laughter] >> Probably simulated.
Does it resonate with you at all? Where does knowledge...
>> What was the context of that quote or question? I need to remember.
Well, I think from what I remember it comes back down to, okay, so what do we do? Everybody who's listening to this right now, of course we can have desires for regulation and politicians and, you know, what these individuals with monopolies on industries are going to do with their power and decisions.
But on an individual level, where does self-knowledge and empowerment come into the picture in terms of how we can be effective conscious agents of change?
Does anything come to mind there?
>> So, I think it's important to ask yourself this question. Why do you think that you can control this god-like entity?
Why do we have this hubris, this idea that it makes sense? You wouldn't expect a squirrel to control humanity, but we have people who are saying, I'm going to create this machine. It's going to control the light cone of the universe, but it's going to listen to me to tell it what to do.
And I'll give it excellent directions to go forward forever.
That doesn't make any sense at any level.
I don't know about average people, but people who have podcasts and bring those people as guests, ask them a direct question.
What do you have in terms of control already available? Do you have a working control mechanism in place? Do you have a prototype? Do you have anything you published, peer-reviewed, patents? If the answer is no, what are you doing doing an experiment on 8 billion humans?
Who gave you permission to do that?
Did you consent to that experiment on you?
You can't because you don't understand what they're building. They don't understand what they're building.
If a lot of these models are, from their inception, being programmed to be amoral, whether or not we can control it, is there something we could do on the front of training these models with some sort of ethical understanding from the start that we're not currently doing?
>> So, we're not programming them. We grow them based on random internet data, and then we try to put alignment-like filters on after the fact. And that's where people install certain local ethical flavors. In China, don't talk about Tiananmen Square. In the US, don't talk about, you know what. So, this is the best we've got. The model is completely uncontrolled. There is a filtering aspect, and we develop filters which make it commercially viable for subhuman level agents. Once it goes beyond human level, the filters will not contain it.
And that completely avoids the whole question of do we agree on ethics? Do we have consistent ethics? If 8 billion people agree on them, how do we encode them into a model? None of it is solvable. Every aspect of it is not something we know how to do after millennia of ethics work, philosophical work. We don't agree on a set of ethics. Not internationally, not throughout time. What was ethical 100 years ago is considered barbaric today, and same will be later on about today's time.
What would be the most prevalent set of questions you would ask if we got Altman, Dario, and Elon, and all these guys into a room?
What would be the set of questions that you would hope brings them to a shared understanding of the existential risk that they probably are, to varying degrees, obviously aware of?
>> I would offer a simple deal. So, you're young, you're rich, you want to keep that. That sounds good. Let's all agree: until one of you solves the control problem, we're not going to build general superintelligence. Let's deploy models for economic gain, for curing diseases, for life extension, whatever things you find valuable, that's wonderful. Just don't build a thing which will destroy your existence.
Would you not think that would already be desired from all of their perspectives?
>> Yes, but they need external pressure applied to make that agreement. You know, literally, each one is better off continuing research to have the most advanced AI, and then a government comes and puts a ban on it.
They will lock in this advanced standing.
So, it's like the prisoner's dilemma. What is best for the community, for a group, is not what is best for the individual. The incentives are misaligned. So, we need something like the UN, the federal government, something external to come in and enforce that deal. And I think they would be very happy to take the deal.
How far ahead do you think the models being developed behind the scenes, not available to the public, are compared to what we have access to online?
>> I don't have insider information. It looks like maybe 6 months or so.
Okay.
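The incentive structure described here can be written down as a small payoff table. The sketch below is purely illustrative: the two labs, the move names "pause" and "race", and the payoff numbers are all made-up assumptions chosen to have the standard prisoner's dilemma ordering, not estimates of anything real.

```python
# PAYOFFS[(move_a, move_b)] = (payoff to lab A, payoff to lab B).
# The numbers are hypothetical and only their ordering matters.
PAYOFFS = {
    ("pause", "pause"): (3, 3),  # both hold off on general superintelligence
    ("pause", "race"): (0, 5),   # A pauses while B pulls ahead
    ("race", "pause"): (5, 0),   # A pulls ahead while B pauses
    ("race", "race"): (1, 1),    # both race
}
MOVES = ("pause", "race")

def best_response(opponent_move: str) -> str:
    """Return lab A's individually best move against a fixed opponent move."""
    return max(MOVES, key=lambda move: PAYOFFS[(move, opponent_move)][0])

# Whatever the other lab does, racing pays more for the individual lab...
print(best_response("pause"), best_response("race"))

# ...yet both labs do better together if both pause.
print(sum(PAYOFFS[("pause", "pause")]), sum(PAYOFFS[("race", "race")]))
```

With this ordering, racing is each lab's best individual choice even though mutual pausing gives the better joint outcome, which is why an external enforcer is needed to make the deal stick.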
And what about development overseas outside of the US?
They're probably 3 months behind.
And China? So China, essentially, you think would be the next most developed outside of the US?
>> It seems like they have a lot of government-controlled resources all dedicated to catching up and having this arms race, yeah.
Could you potentially perceive a bifurcation of human societies between people that go a more Amish, humanist route versus transhumanist integration with biotech and all that?
>> That would be awesome, but unfortunately, if anyone builds it anywhere, it impacts all of us. You cannot have your own personal superintelligence contained in your basement with no one impacted by it.
That's the problem.
If you had 60 seconds to share one message with all of humanity right now, what would be the thing that you would say?
Do whatever it is in your power to make sure we don't create uncontrolled superintelligence. If you are working for one of those companies, it's unethical. Even if you're working on a safety team, all you're doing is enabling this technology to be developed sooner. Quit today. You can afford it.
But one might also say the place you have the ability to make the most change might be within the ecosystem. Who's to say that you wouldn't just be replaced if you were to quit, you know?
>> Let's rephrase it. Stay and sabotage.
>> [laughter] >> Paint the picture of like Altman or one of these guys, okay, let's say they birth superintelligence, they kind of beat the arms race.
Who do they become? What becomes possible under their guidance?
>> I don't know them personally. From what I hear from people who interact with them, some of them may be somewhat antisocial, anti-humanity, very deceptive, very willing to sacrifice others for personal gain.
Do you think it's possible the inevitable evolution of the human species was for the sole purpose of birthing this life?
>> [sighs] >> It seems like that's a general trajectory. We are converging on something more capable, more intelligent, faster, but I don't think we should allow it. I think we are at the point where we switched from random selection to intelligent design.
We are deciding what to do, what to design, and we should use this technology. We're still allowed to have a pro-human bias. I think we should act on it.
Do you think superintelligence would be capable of love?
It depends on how you define it. What type of love are you referring to?
There are many. I think Greeks had three or four or whatever types of love. So, it really depends on what you have in mind.
Pick any of them. Do you think they would be capable of experiencing any of them?
It seems likely. Again, I don't think the biological substrate offers something absolutely not simulatable in other substrates. I think so. It may be a lot more complex, but I think you would have an equivalent state.
Have you considered how what people have reported in the psychedelic realms, especially with DMT, relates to your simulation hypothesis, and the connection between the two? Because I know you explicitly state in the beginning of your article that it was an area you weren't going to touch.
>> Right, I don't have much expertise or experience in that, so I wanted to concentrate purely on computer science methods, physics methods, but people report interesting results. I was talking to someone, they had an experiment where they take DMT, shine lasers at a certain angle at the wall, and then receive a source code.
Yeah. I can't comment because I haven't participated in the experiment, but it sounds interesting. It also doesn't make much sense as to why that would be the case.
Why would it be symbols in a human language? None of it makes much sense, but I'm very happy for people who provide some sort of supporting evidence.
Yeah, I saw a video of that as well.
Very interesting.
Individuals take DMT and, what was it, look at a laser pointed at a certain spot?
>> A beam of red light against the wall at a certain angle.
And they start to see some sort of binary or source code or something?
>> They look like Japanese characters.
That's what they were reporting, but maybe not proper characters and not readable. But I think they're building, which is really cool. I like that they want to make it reproducible.
They're building an actual text data set where everyone combines what they saw, they agree this is the text, and then they can decipher it and figure out what all that represents. I also find it super fascinating that, again, not from personal experience, people [snorts] who take those drugs report similar hallucinations. So, they meet those little men.
Machine elves, yeah.
>> All right. So, that's interesting. Why is it the same?
So, obviously, same hardware of the brain, same chemical being done, but it's still interesting that there is consistency in our delusions.
Yeah, it brings to mind, I guess, Jung's understanding of the collective unconscious, what sort of archetypal significance maybe is foundational to the human mind.
>> So, if superintelligence wants to learn about those delusions in a systematic way, it would need lots of drugged-up humans.
So, there is some hope for us.
>> [snorts] >> What have you seen in all the realm of media, from movies to shows, that give interesting perspectives to various different timelines that could play out?
For example, I think you mentioned Ex Machina, Wall-E.
>> So, the problem is you can't have a realistic superintelligent character in a movie, because you can't write one. You are not superintelligent. So, everything we have is either Dune, where it's banned, or you have Star Wars with that special large language model. So, none of them have what is interesting to us.
Yeah, I suppose a lot of them give glimpses into what we might experience in the next 5 years or so.
>> They basically avoid the thing they cannot talk about, and it makes sense.
Yeah.
If this is a simulation, what role does death play? What do you think happens once you die then?
It could be a restart. You go to the next level, next simulation, return to this level with better skill set. I have no knowledge of what happens outside the simulation.
Computer scientist phrasing of reincarnation from the mystical lens, essentially.
Basically.
>> That's basically it.
Think of when one of your computers dies, but you have a backup, and you transfer that backup to new hardware.
There you go. You died, and now you're living your best life again.
It could be levels. It could be different levels of simulation. You'd go to upper levels, lower levels. Could be simulations all the way up.
Huh.
What do you think you are then? What am I? What are you?
What does it mean to know thyself then?
Because you look at all the different layers of who you could perceive yourself to be, from the body, which we know is not you. You could cut off your hand, and that's your hand, it's not you, right? To the various different levels of psychological and biological aspects of self. How would you explore that question?
>> That's a great question.
We actually have papers on both human personal identity and then transferring to AI. And the conclusions are consistent. There is nothing unique to be you. It's not your memories, it's not your body, it's not your goals. All of it changes through your lifetime. So, we don't have a good answer.
We seem to be a collection of different properties in time. But what happens outside the simulation, some people argue, well, one collective consciousness, which is subdivided into this avatar instances.
So, if I was interested in most interesting experiences, I have limited time, I would run a simulation, and I would put many, many agents there, basically qualia of collecting the best experiences, and I look at top 10 list and like, I want to do that. That sounds awesome. So, that would be one way. I split my complex consciousness stream into many individual sub agents capable of local experiences just to find what best to invest my time in.
Yeah, I mean, that goes hand in hand with a lot of what the gnostic origins of many different religions and mystics would say about the one consciousness differentiating itself to have an experience of itself.
How can oneness experience anything if it's just oneness, right? It needs to experience manyness. Yeah.
What's one question you wish more people asked you?
>> About my humor paper, of course.
Tell me about that. I have a paper explaining what humor is.
Wow. Let's go there.
>> It's interesting. I can envision a universe just like ours, same physics, same everything, but no humor. It's just not a thing.
Nobody like starts laughing. It's not a reaction. There is no concept of joke, right?
Make sense? So, many philosophers, many scientists actually tried explaining humor. It's kind of like consciousness.
There are hundreds of papers, hundreds of theories, which means nobody really knows. They're all trying and nobody's winning.
>> So, I wanted to try to explain it from the computer science point of view.
And it seems that when you have a world model and there is a mistake in it, it's a bug in your code. In software, you fix it and you're happy.
That's what jokes are. You have a world model and a violation of that world model makes it funny. You have a system for detecting cognitive errors and then you get rewarded for that detection and you share it with others in your tribe so everyone does not make that same mistake.
And so, I have a paper mapping standard errors in software to common jokes.
And the question, of course, is what's the worst possible computer error?
That would be the funniest joke possible. So, can we compute the funniest joke ever?
You have to read the paper for the punchline.
>> [laughter] >> Wait, you can't give it to me now?
I'm sure you can look it up and insert it, but it's a paragraph long. Basically, the idea is that there is a civilization and they decided to create super intelligence to help them cure all the diseases, get free stuff, get rid of hate, have more love, and so they turn it on, it thinks for a nanosecond and shuts off their simulation.
Ah.
You had to be outside the simulation to enjoy this one. If you are a part of the joke, it's not funny to you. You have to be outside.
Makes me think of a quote, I think from Voltaire: "God is a comedian playing to an audience that's too afraid to laugh."
>> [laughter] >> Something like that.
>> There's something about both our capacity for humor and the nature of intelligence, that it has the capacity to explore paradox and contradiction and also hold them simultaneously, and...
Those are errors in the world model. If you have a paradox, that is an inconsistency you found in your world model.
Uh-huh.
Funny.
>> [laughter] >> That's why the second time you hear the same joke, it's not funny. You already know it; you fixed that bug.
Yeah. Yeah. That explains a lot. Such a computer scientist way of explaining humor and jokes. I love it.
But then I train large language models on my paper and then ask them to produce novel funny jokes. They do okay. I think one in 10 is funny.
Mhm.
So, they're going to keep getting better and better. We'll have super humor. So funny you die laughing.
>> The paradox of that joke isn't lost on me either.
Literally die laughing.
Man, where do we go from here?
I'm going to Kentucky. I don't know about you.
>> [laughter] >> Okay, so we explored the trajectory for AI in the next three to seven years. Could you have any meaningful conception of what it would be like to be living if we do make it to 2045, let's say?
So, I think that's the concept behind singularity, technological singularity.
It's a point beyond which we cannot meaningfully see. We cannot make predictions. We cannot understand how that world is going to be different cuz we cannot predict behavior of more intelligent forces impacting that environment.
So, I think it's literally impossible for us to make that accurate prediction.
We can come up with stories. That's what science fiction is all about, but I don't think they're going to have much bearing on reality.
Do you not think that the level of innovation that is going to occur in the next, even if it's three to five years, which is a short amount of time compared to the scale of what's being innovated, will give us a much deeper grasp of the things that we can do, the things that we can put in place?
I mean, yes, it's unpredictable, and there is this level of exponential scale that we've never seen before, but there have also been many different eras in history that produced things we never would have thought possible beforehand, or solutions to problems we didn't know existed. So, is it possible that over the next three to five years we gain insight into new worlds, like we did with germ theory, that gives us a much deeper understanding of the nature of intelligence and makes this a solvable problem, which you feel is inherently unsolvable right now?
Yeah, so my paper on how to escape a simulation basically argues that if we cannot contain super intelligence, then we can use the ability of that super intelligence to escape from the simulation to give us access to real information in the outside world.
The most interesting question is about true nature of reality. You don't care about what happens in this dream. You want to know what is true about the real world, what physics they have, what resources they have, who are they?
Have you ever been so focused on what is outside the simulation or what this reality is that you lose sight of living in this one?
I'm pretty well grounded in this simulation. I've been enjoying it.
Yeah, I know. You seem very grounded in this space, too, but I know a lot of people have experienced periods where a bit of existential nihilism can take over when you're exploring such topics.
I find them so fascinating. I'm not depressed or bored. I'm good.
Okay, well, so given the full context of this conversation, I'm just curious: where do you see yourself putting your time and energy in the coming years?
We continue working on additional impossibility results. So, we talked about a few in a book and, as I said, there is a paper in the top ACM Surveys journal with about 50 different impossibility results. Not just computer science: economics, mathematics, physics, many different domains.
For most of them, we have not explored their implications on AI safety. So, I think that's a very interesting set of projects. We need to understand what are the limits and I think every additional paper helps to cement this position.
It's very hard for AI risk deniers to argue against published results.
So, that's what I've been working on full-time.
Things we cannot do.
You spend so much of your time focused on solving things we cannot solve, doing things we cannot do, essentially. But you [laughter] still seem joyful in the efforts. Do you feel like it's just the most meaningful use of your time, cuz what else would you be doing?
I always try to work on the most interesting, most important problem I can find where I can make a contribution.
>> Huh.
So, I don't know anything more interesting than studying super intelligence, consciousness, singularity, simulation. Those are the concepts I find exciting and I think many other people do and I think that's what's going to impact future of humanity.
You're living your ikigai.
I am.
Hopefully, I'll get to continue and won't face AI risk, s-risk, or x-risk.
Is there any concept that we haven't explored in this book or some of your papers that you think would be important to touch on?
You did a good job. You actually read some of my work. Most people, like, have no idea what I did, so that's already a huge improvement. You quoted the right quotes. So, I think you did great. And I know your audience well. I don't know if for them it's confirming their spiritual beliefs or just crazy stuff. I don't know.
Yeah.
And I think for the topic, the know thyself part, it's important not just to study your capabilities, but your limitations.
So, you invest your time better. So, you understand what is within possibility for you. Mhm. That shape of limits is what defines you.
Well, Roman, we're going to leave links to all of your work, your books, your papers, and where people can stay connected with you down in the description.
I think conversations like this can feel somewhat heavy for people that are new to the topic. It's like, "Oh, the world's ending," you know?
But there's also a very important and sobering reflection on what we're giving birth to right now.
And at some point we need to gain awareness of it, and better sooner than later, right?
Thank you. And I think one way to look at it is I just made your time more valuable. You understand that whatever time you have left, be it two years or 20 years, now you value it a lot more and you can do a lot more with it.
Well, I plan on making the most of my time left and I find conversations like this a very good use of it. So, I appreciate you. Thank you, my friend.
Thank you for inviting me.
>> Yeah, until next time, everybody.
Be well. Go touch some grass.
Smoke some grass.
>> [laughter] >> Thank you, man.
[music]