Richard Sutton: AI Creativity and Discovery
Added: 2026-05-13

124 views · 2 · 15:07 · SAIRfoundation · Original Release: 2026-05-13

Generative AI systems face a fundamental limitation where they can produce output that is either novel or good, but rarely both simultaneously; to achieve true creativity and discovery in science and mathematics, AI systems must incorporate a three-step process of variation (generating diverse possibilities), evaluation (assessing which possibilities work based on clear objectives), and selective retention (keeping the best results), which is absent in standard supervised learning and backpropagation but present in reinforcement learning and combinatorial search approaches.

[00:00:01]Good day, ladies and gentlemen. I regret that I'm unable to be with you all today to engage in a back-and-forth discussion, but I'm nevertheless pleased to be able to share with you, via this recording, some high-level thoughts about the current and future state of artificial intelligence, and in particular about AI's relationship to science and mathematics, which is, as I understand it, the central focus of this meeting and of the SAIR Foundation.

[00:00:29]I would like to start with an old joke.

[00:00:31]I'm sure you've all heard it before.

[00:00:33]This is the one about the researcher whose work is being evaluated and the review comes back and says, "This work is both novel and good."

[00:00:44]Unfortunately, the parts that are good are not novel and the parts that are novel are not good.

[00:00:54]My first point about AI is that this assessment applies exactly to large parts of AI as we know it today. Not all of AI, but a large part of it: pretty much all of what we mean by generative AI, which includes large language models, the image and video models, and even new methods for learning world models. All of these AIs take a large number of examples and produce a model that behaves similarly to the examples, that is, one that generates text like people, or images like artists or nature, or videos like those we find on the internet.

[00:01:33]Now, don't get me wrong, generative AI can be extremely useful. There's no doubt about that.

[00:01:40]But the assessment of the joke still applies. These systems can produce output that is both novel and good, but not at the same time.

[00:01:50]In many ways, this is just absolutely not a problem. When we ask an AI for an answer from the internet or to summarize a document, we don't want it to be novel. We're happy if the quality of the answer, the goodness, comes from the source material, from the people who wrote the document or the articles on the internet.

[00:02:11]If the AI's answer is novel, it means going beyond the source material, adding something beyond it.

[00:02:19]This is what we call hallucinations. In most cases, we don't like it when AI makes something up, when it adds something novel.

[00:02:28]One exception, of course, is when we're not looking for facts or reality, but for fiction and entertainment. We might ask for a bedtime story for a child, or for an image based on existing images but which is nevertheless different and distinct from them. In these cases, it is never easy for us to know how creative the AI is actually being, as we do not know how close the AI's story or poem or image is to the source material.

[00:02:56]In a real, practical sense, we can't know this. We can't know this because the internet is too big. The possible sources that the AI may draw upon are too numerous.

[00:03:06]Now, when we ask for fiction or novelty, the AI can give it to us because its processing is in part stochastic.

[00:03:16]Every decision that the AI makes can go multiple ways, and it will go different ways and produce a different trajectory every time. The trajectory can be random and thus novel, or it can be based on the training data and thus good, because the training data is good, sourced from people or reality.

[00:03:36]Thus the trajectory generated by the AI system is either novel or good. It's either based on randomness or it's based on data. But it is never both. It's never both at the same time.

[00:03:50]So really, I think it's okay if the output of generative AI is never good and novel at the same time. For the researcher in the joke, this is a devastating criticism. But for most things in the world, for most things that you do, it's not terrible to not be both novel and good, and in particular for generative AI it's fine. Generative AI is meant to be a mimic. This is what supervised learning is for. Generative AI can be extremely useful even when it just mimics, if it is faster or cheaper or smaller or more customizable or more copyable than the thing being mimicked. It is okay if generative AI cannot be both novel and good at the same time.

[00:04:40]It's still a transformative technology.

[00:04:44]But this inability to be both novel and good is a limitation, an important limitation.

[00:04:51]Remember, we are here to use AI for science and mathematics. And for these areas, the assessment of the reviewer in the joke is devastating. For these areas, we need true creativity and we need true discovery.

[00:05:04]Generative AI, which I sometimes call mimicking AI, will never get us there. For these, we need something more. And indeed we have something more in other parts of AI. We have AI systems that can give us more. We have AlphaGo with its world-changing Move 37, or AlphaZero with its brilliant, original chess-playing style. And we have GT Sophy, which drives simulated race cars better than any person. We have AlphaFold and AlphaProof and Claude Code, which have brought us advances in science and mathematics and programming. We have RL at Lyft, which optimizes the assignment of cars to passengers in the ride-sharing business.

[00:05:54]All these systems have found things that are both novel and good. And truth be told, some language models have been augmented in ways that make them more than generative AI, more than based just on supervised learning. But to the extent that they are based just on supervised learning, this limitation applies.

[00:06:13]Now, all these other systems that I've listed have some additional feature which makes them capable of true creativity and true discovery. And it's important for us to recognize that this ability to create and discover is not present in ordinary, garden-variety generative AI. It is something that cannot come from just supervised learning, cannot come just from learning from examples. So what is this thing, this additional ingredient?

[00:06:52]Well, it's a simple thing, a common-sense thing. It's not really new. We have many names for it, but unfortunately we don't have a really good name for it. Let me just call it discovery, as I have. Basically, discovery is just the idea of trying many things and seeing which work, then keeping those that have worked the best.

[00:07:13]So, evolution by natural selection works this way. The scientific method works this way. Just ordinary life and learning works this way. We try things and we remember what works. What could be more obvious?

[00:07:27]In this behavioral case, the last case, psychology has two names for this discovery: instrumental learning and operant conditioning.

[00:07:39]In machine learning, we have the name reinforcement learning. Reinforcement learning is basically this discovery ability in a computational learning agent.
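As a minimal illustration of trying things and remembering what works, the sketch below runs an epsilon-greedy agent on a three-armed bandit. It is not taken from the talk: the function name `run_bandit`, the reward means, the noise level, and the exploration rate are all assumptions chosen for illustration.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with Gaussian rewards (illustrative)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # remembered value of each action
    counts = [0] * n_arms       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Try something regardless of what is currently believed.
            arm = rng.randrange(n_arms)
        else:
            # Otherwise repeat the action that has worked best so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        # The environment, not the agent, supplies the evaluation.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental sample average: remember what works.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical reward means; the third arm is the best one.
estimates, counts = run_bandit([0.2, 0.5, 1.5])
print(counts)
```

The agent is never shown an example of the correct action. It only receives rewards for the actions it tries, which is the difference from learning by imitation.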

[00:07:50]We also see discovery in planning and in combinatorial search. Really, anything that involves the idea of generate and test, keeping the best.

[00:08:03]Okay. The essence of discovery is combining three steps: variation, evaluation, and selective retention.

[00:08:14]These three steps are key. And it's more than supervised learning, which doesn't have the evaluation or retention.
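The three steps can be sketched as a small generate-and-test loop. This is only one way to write it, under stated assumptions: the names `discover`, `toy_objective`, and `gaussian_step`, the toy objective itself, and all parameter values are hypothetical and do not come from the talk.

```python
import random

def discover(evaluate, vary, seed_candidate, generations=200, brood=8, rng=None):
    """Generate-and-test loop: variation, evaluation, selective retention."""
    rng = rng or random.Random(0)
    best = seed_candidate
    best_score = evaluate(best)
    for _ in range(generations):
        # Variation: propose several candidates derived from the retained one.
        candidates = [vary(best, rng) for _ in range(brood)]
        # Evaluation: score each candidate against the objective.
        scored = [(evaluate(c), c) for c in candidates]
        top_score, top = max(scored, key=lambda pair: pair[0])
        # Selective retention: keep a candidate only if it beats the incumbent.
        if top_score > best_score:
            best, best_score = top, top_score
    return best, best_score

def toy_objective(x):
    # Hypothetical objective with its maximum at x = 3.
    return -(x - 3.0) ** 2

def gaussian_step(x, rng):
    # Random perturbation of the current candidate.
    return x + rng.gauss(0.0, 0.5)

best_x, best_score = discover(toy_objective, gaussian_step, seed_candidate=0.0)
print(round(best_x, 2), round(best_score, 4))
```

Removing the `evaluate` calls leaves only the variation step: candidates would still be produced, but nothing would be recognized as better and nothing would be kept.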

[00:08:23]Now, I'm not the first to identify these three things and point out that they're important. This combination of three things that I'm calling discovery is important to science, to natural selection, and to animal behavior. In particular, I think of papers by Donald Campbell, Daniel Dennett, and Gary Cziko. What is new in my remarks today is to directly relate the idea of discovery to modern AI, and to help us see that it's not present in supervised learning and generative AI, and in particular that discovery is not present in backpropagation or gradient descent.

[00:09:05]Well, let me be a little more explicit.

[00:09:07]What is missing from generative AI?

[00:09:11]As I've remarked, these generative AI systems do have stochastic aspects. So they do generate a variety of trajectories and behavior.

[00:09:24]The missing part is the evaluation step.

[00:09:26]The generator of generative AI was pre-trained by supervised learning, and that leaves it no way at runtime to evaluate what it generates. And of course, without evaluation there can be no selective retention and thus no discovery.

[00:09:44]It has variation that can bring novelty, but without evaluation there's no discovery. And arguably there's no creativity. That is, I would say that creativity requires that the things generated be evaluated. Without evaluation and retaining the best, there is nothing created. The novelty flickers into existence, but if its value is unrecognized then it flickers away and is lost.

[00:10:11]I think what we really mean by creativity requires not just random generation, arbitrary generation, but a recognition of value and a retention.

[00:10:23]Now, in many cases when we use AI, the evaluation part is provided by the human users, as when we use generative AI to make a bunch of pictures for us and then we pick the picture that we like the best. So there's the random generation, there's a human evaluation, and the overall process is a discovery.

[00:10:50]It is the discovery of a good picture or a good piece of writing or a good poem.

[00:10:58]But the more important case is when the evaluation comes from a clear objective.

[00:11:04]So some moves are better than others. Some moves lead to checkmate. Some mathematical steps lead to a proof and some don't. Some actions that you take in the world result in high reward and others don't.

[00:11:21]Some genotypes make more copies ultimately. Some theories explain the data better. So the evaluation comes from the domain and what you're trying to do. So in these important cases where we have discovery, we have evaluation coming from a clear objective that's known to the system.

[00:11:40]Now let's talk a little bit more about the variation step. Some people like to call it blind variation, where blind here means uninformed variation, a shot in the dark sort of thing. But you don't need your variation to be completely uninformed. A good scientist does not select the theories that he tests at random.

[00:12:04]So we don't want it to be completely uninformed, but it also can't be completely informed. It can't be completely determined. There must be some uncertainty about where the answer lies in order for there to be discovery when you find it.

[00:12:20]So in practice, the variation is always partly informed and partly blind. But it's the blind part that really corresponds to the discovery.
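A toy search can make the mix of informed and blind variation concrete. In the sketch below, an informed proposal is a small step from the best candidate found so far, and a blind proposal is drawn uniformly from the whole range. The landscape, the `blind_fraction` parameter, and the function names are assumptions made for illustration, not anything specified in the talk.

```python
import math
import random

def objective(x):
    # Hypothetical landscape: a broad local peak near x = 2 (height about 1)
    # and a narrow global peak near x = 8 (height about 2).
    return math.exp(-(x - 2.0) ** 2) + 2.0 * math.exp(-((x - 8.0) ** 2) / 0.1)

def search(blind_fraction, steps=2000, seed=0):
    rng = random.Random(seed)
    best = 0.0
    best_score = objective(best)
    for _ in range(steps):
        if rng.random() < blind_fraction:
            # Blind variation: ignore what is known and sample anywhere.
            candidate = rng.uniform(0.0, 10.0)
        else:
            # Informed variation: a small step from the retained candidate,
            # clamped to the search range.
            candidate = min(10.0, max(0.0, best + rng.gauss(0.0, 0.3)))
        score = objective(candidate)
        if score > best_score:
            # Selective retention: keep only improvements.
            best, best_score = candidate, score
    return best, best_score

informed_only = search(blind_fraction=0.0)
partly_blind = search(blind_fraction=0.2)
print(informed_only, partly_blind)
```

In this particular landscape the informed-only search climbs the nearby peak and stays there, while the partly blind search can land near the narrow peak by chance and then refine it with informed steps. Other landscapes would favor a different balance.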

[00:12:31]Okay. So now let's briefly go all the way to modern deep learning, to the backpropagation algorithm. At first it might seem that backpropagation is incapable of discovery as I've defined it, because backpropagation is deterministic and thus incapable of variation. But this is not entirely correct. The weight updates of backprop are deterministic, but the weights are initialized to small random values, and this provides some variation to start with. Now, this random initialization process is often downplayed, but it's in fact a necessary part of backprop, and it must be done properly to get good results.

[00:13:18]So in backprop the variation is done once, when the network is initialized. So its effect is only temporary, and later the network does tend to lose its ability to learn, its plasticity. This is the weakness of deep learning that is alleviated by a new algorithm that my group presented in Nature a couple of years ago. Our continual backprop algorithm made one small change: every so often, a less-used neuron is reinitialized to small random weights, and this allows the variation to continue and plasticity to be retained.
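The reinitialization step described above might look roughly like the following. This is a simplified sketch and not the published algorithm: the `utilities` scores are assumed to be given, the helper name `reinitialize_least_used` is hypothetical, and zeroing the outgoing weights is an assumption made here so that the reset does not disturb the network's current outputs.

```python
import random

def reinitialize_least_used(w_in, w_out, utilities, rng, scale=0.1):
    """Reset the hidden unit with the lowest usage score.

    w_in[j]   : incoming weights of hidden unit j
    w_out[k]  : outgoing weights into output k, indexed by hidden unit
    utilities : one usage score per hidden unit (assumed to be given)
    """
    j = min(range(len(utilities)), key=lambda i: utilities[i])
    # Fresh small random incoming weights restore variation for this unit.
    w_in[j] = [rng.uniform(-scale, scale) for _ in w_in[j]]
    # Assumption: zero the outgoing weights so the outputs are unchanged.
    for row in w_out:
        row[j] = 0.0
    return j

rng = random.Random(0)
w_in = [[0.5, -0.4], [0.9, 0.8], [0.01, 0.02]]  # 3 hidden units, 2 inputs
w_out = [[0.7, -0.6, 0.001]]                    # 1 output unit
utilities = [0.42, 0.77, 0.003]                 # hypothetical usage scores
reset_unit = reinitialize_least_used(w_in, w_out, utilities, rng)
print(reset_unit, w_in[reset_unit], w_out[0])
```

A fuller version would call this every so often during ordinary backprop training and would need some way to compute the usage scores, and to keep a freshly reset unit from being reset again immediately. Those details are left out here.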

[00:13:59]Okay. Now, of course there's much more that could be said about creativity and discovery, but I need to wrap up. So let me emphasize the main point. Creativity and discovery are more than what can be done by supervised learning, more than what can be done by pattern recognition or prediction, even more than what can be done with world modeling. These things are all really important, but they alone will not bring us to discovery. Discovery requires evaluation, from a person or from an explicit goal, and only in that latter case, with an explicit goal available to the system, will we attain fully autonomous AI.

[00:14:47]So this is my call to arms. If we want the full power of AI scientists, then we should share the goals with them so they can create, evaluate, discover, and in these ways fully participate in achieving the goals. Let's be bold.

[00:15:03]Let's automate creativity and discovery.
