Generative AI has evolved from early concepts like Karel Čapek's 1920 play 'R.U.R.' (which coined the term 'robot') and Alan Turing's 1950 Turing Test, to modern systems using transformer architecture and large language models trained on trillions of words. Modern AI passes the Turing Test with high accuracy (GPT-4.5 judged as human in 73% of cases) but still struggles with fluid intelligence benchmarks like ARC-AGI (Gemini scores only 0.37%). The technology works by training neural networks on massive datasets, using token embeddings to understand word relationships, and employing diffusion models for image generation through iterative denoising processes.
Deep Dive
The Whole Story Of Generative AI (Broadly)
I can't be the only one who feels like AI just appeared out of nowhere. It's only become mainstream to use in the last few years, but now it's everywhere.
Like last week, I created a new Instagram account, and scrolling through my feed, I noticed how it took me like 10 posts to see the first caption written by an actual human. Well, how do I know the captions were AI-generated, you might ask? I've recently been playing a lot with AI tools, and I've become really good at seeing if something is written by a chatbot. And also, mainly because of this symbol.
Now, for anyone who doesn't know what that is, it's called an em dash, and it's heavily used by AIs like ChatGPT, Gemini, and Claude. And fun fact, they will use it even if you tell them not to. And sure, the em dash is not new, and it has existed way before AI. But before AI, I had never seen anybody use it besides news articles and professional writers, and now it's in every caption on social media, along with all the top overused AI buzzwords and perfectly parallel sentence structures. And funnily enough, when researching for this topic, I noticed that a lot of the articles I read and videos I watched were also written and narrated by AIs. And it felt kind of weird seeing AIs write on the topic of history of AI. You know, kind of like we humans write about our own history.
Makes you wonder if the internet truly has become dominated by bot activity, AI-generated content, and algorithmic curation. But anyways, I did not pay any attention to the whole AI thing when it was all new and exciting, because truthfully, I thought it was just another thing that everybody would forget about and move on from. But it kind of looks like I was wrong, and it's here to stay. So, this video is just me trying to educate myself, catch up with where we are right now, and find out how it all started. So, if you feel like you kind of missed out on the whole beginning of AI and now feel confused to see it everywhere, then join me on this little self-education journey.
So, I looked into it, and it kind of looks like the groundwork for artificial intelligence began all the way back in the 1920s. And yes, while the idea of a machine being able to function on its own is sort of ancient, for the purposes of this video, we're only going to focus on the 20th and 21st centuries, when engineers and scientists began to make strides toward our modern-day AI. So, let us go back in time roughly a century. The year is 1920, and a Czech playwright, Karel Čapek, has published a play called Rossum's Universal Robots. The play begins with a girl named Helena visiting a robot factory. The director, Domin, says that robots will change the world and make labor so cheap that all work and poverty will be eliminated. Helena doesn't like this and wants to convince the robots to revolt, but they don't care.
Fast forward 10 years and Helena is still at the factory, now married to Domin, and humans have stopped having babies. Helena is angry about this and burns the formula for making robots.
And then the robots revolt and turn violent. Domin figures that they'll trade the formula for their own safety, but since Helena burned it, the robots decide to kill everyone on Earth except for this one dude named Alquist, because he works with his hands and the robots are, like, impressed that he shows such robot-like qualities. Two of the robots then suddenly gain human emotions and fall in love, and Alquist decides that this must mean that the world is saved, and he sends them off to produce new life, which makes no [ __ ] sense, but to our friend's credit, he wasn't a biologist, so maybe he doesn't know how babies are made. The humans, after all, stopped reproducing, so maybe the whole making babies thing was lost knowledge at this point. Yeah, that's 1920s sci-fi horror.
Probably wouldn't make anybody's skin crawl today, right? Well, it did back then, because the play was deeply disturbing for Makoto Nishimura, a 40-year-old professor at the Hokkaido Imperial University in the Japanese city of Sapporo. He was concerned about the idea of robots being seen as free labor and said that humans could benefit from the evolution of artificial ones if they were designed to be inspirational models rather than slaves. And to prove his point, he wanted to build such a robot, which he did, and he named it Gakutensoku. It was built as a direct response to a company called Westinghouse, which in 1927 had unveiled the Televox, a vaguely human-shaped robot designed to connect telephone calls. Nishimura's own early creation could change its facial expression and move its head and hands. It had a pen in its right hand and a lamp in its left, and perched on top of it was a bird-shaped robot named Kokukyocho. When the bird cried, Gakutensoku's eyes closed and it would assume a thoughtful expression.
After a while, the lamp would light up and it would start to write words with the pen. Exactly how it operated has long been a sort of mystery for historians, because sometime after its completion, the machine was lost under somewhat mysterious circumstances while touring Germany.
And while a few photos of the original design still exist, they're not publicly available and lack many important technical details anyway. So, while Nishimura showed how machines could physically mimic humans, his robot didn't really show any signs of actual intelligence.
Alan Turing was an English mathematician and computer scientist who played a pivotal role in cracking intercepted messages, helping the Allies defeat the Axis powers in World War II. He's a man with so many incredible achievements that covering them all would require a video of its own. But for today, we're going to focus on one of his most famous ideas, the Turing test.
Originally called the Imitation Game, this was a method Turing designed to determine if a machine could truly exhibit intelligent behavior.
Here's how it works. A human judge has a text-based conversation with two hidden participants.
One is a real person, the other is a machine.
The judge's goal is to figure out which one is which. If the judge cannot reliably tell the difference, the machine passes the test. What's fascinating is that accuracy doesn't matter. The machine doesn't have to give correct answers. It has to give human answers. Turing first introduced this concept in 1950 in his paper Computing Machinery and Intelligence. Instead of asking, "Can a machine think?" which is hard to define, he changed the question to "Can a machine imitate a human well enough to fool us?"
And now you can answer, "Can it?" since that whole chapter on Alan Turing was completely written by an AI chatbot. And therein lies a problem. AIs sometimes start to hallucinate, meaning they give confident but false or misleading responses that often appear plausible despite being inaccurate, fabricated, or illogical. Like in this case, the AI presented a paraphrase as if it were Turing's exact wording. Alan Turing did not, in fact, change the question to "Can a machine imitate a human well enough to fool us?" What he actually wrote in Computing Machinery and Intelligence was, "Are there imaginable digital computers which would do well in the imitation game?"
Close enough? Yes, maybe. But this can get dangerous if you trust AI too much.
That's because large language models don't actually know facts. Instead, they predict the next word based on patterns learned from massive text data. If the training data is sparse or inconsistent, the model may fill in the gaps with something plausible but untrue.
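To make "predicting the next word from patterns" concrete, here is a minimal sketch. It is not how a real large language model works internally; it only counts which word follows which in a tiny made-up corpus, and the function names `train_bigram_counts` and `predict_next` are hypothetical. It does show the key point, though: the prediction is whatever was statistically common, with no notion of whether it is true.

```python
from collections import Counter, defaultdict

def train_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Tiny made-up corpus (an assumption for illustration only).
corpus = "new york is big . new york is busy . new shoes are nice"
counts = train_bigram_counts(corpus)

print(predict_next(counts, "new"))    # the most common follower in this corpus
print(predict_next(counts, "pizza"))  # a word the toy model has never seen
```

A real model does not return nothing for unseen input the way this toy does. It still produces its most plausible guess, which is roughly where hallucinations come from.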
For example, one time a United States lawyer used ChatGPT to help draft court filings and ended up citing entirely fake legal cases. When the opposing side challenged the citations, the lawyer claimed not to have realized that ChatGPT was a generative language tool rather than a reliable legal database. Or the time when Google's AI Overview suggested adding 1/8 of a cup of non-toxic glue to pizza sauce to make the cheese stick better. Apparently, it got this answer from a decade-old Reddit comment from a user called Fox Smith.
But indeed, AI has passed various interpretations of the Turing test at different times, with the most notable milestones being in 2014 with Eugene Goostman, a chatbot pretending to be a 13-year-old Ukrainian boy, convincing 33% of judges that it was human. And more recently, in 2024 with GPT-4, when a study conducted by cognitive scientists at the University of California San Diego found that human participants could not reliably distinguish between GPT-4 and a real human during a 5-minute text-based conversation. And while the 2014 event was heavily debated, modern LLMs are now widely considered to pass, with some studies suggesting AI can convince judges it is human more often than actual humans can, with GPT-4.5 being judged as human in 73% of cases.
Well, if the Turing test doesn't work anymore, or is too easy for modern-day AI and therefore cannot be reliably used to judge intelligence, what do we then use to define if an AI truly can exhibit human-like intelligent behavior? Well, a good example used today is the ARC-AGI, the game you're seeing in the background of this video. It's an interactive reasoning benchmark developed by the ARC Prize Foundation, a nonprofit dedicated to accelerating the development of artificial general intelligence, or AGI for short. The ARC-AGI challenges AI agents to explore novel environments, acquire goals on the fly, and learn continuously without any prior knowledge or given instructions.
A 100% score in this game means that AI agents can beat every level as efficiently as humans, but as of today, even advanced AIs fail, with Gemini taking the number one spot by scoring 0.37%.
So, at least as of today, AI is still far behind humans in intelligence, or at least in fluid intelligence, which is what this game measures. But they sure have come a long way, and modern AIs are fantastic tools that can have long and complicated conversations and help us humans with all sorts of tasks, from setting an alarm to coding an entire app, or even translating entire live conversations on the fly.
And, they have a very, very humble beginning.
ELIZA, developed by Joseph Weizenbaum at MIT, was the first chatbot designed to simulate human conversation. It would be unfair to compare it to something like our modern chatbots, but this was 60 years ago, and it truly amazed the people of its time. Even though its memory was limited to the immediate previous topic, and it could not maintain context over extended conversations, it became famous for a script called Doctor, which mimicked a psychotherapist, allowing ELIZA to reflect words back as questions. It worked by scanning user input for specific keywords like I, sad, and always. ELIZA would then break down the sentence and reassemble it into a question using a predefined template, transforming first-person statements into second-person questions. An example conversation with ELIZA would look something like this.
User: My boyfriend made me come here.
ELIZA: Your boyfriend made you come here?
User: He says I'm depressed.
ELIZA: I am sorry to hear you are depressed.
User: It's true, I am unhappy.
ELIZA: Do you think coming here will help you not to be unhappy?
If ELIZA did not find any recognizable keywords, it would just pick a very generic answer to keep the conversation going. These include phrases like, "Please go on." or "That's very interesting."
Modern AIs, however, work very differently. Instead of simple pattern matching and keyword substitution rules created by humans, modern AIs utilize neural networks, machine learning, and large language models trained on trillions of words, to the point where reading through a modern AI chatbot's training data would take a single human up to a hundred millennia of non-stop reading.
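The ELIZA-style keyword matching and pronoun reflection described above can be sketched in a few lines. This is a heavily simplified illustration, not Weizenbaum's actual Doctor script: the rules, the `REFLECTIONS` table, and the function names are all hypothetical and cover only a couple of patterns.

```python
import re

# Hypothetical word swaps that turn first-person wording into second-person.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

# Hypothetical keyword rules: a pattern to look for and a reply template.
RULES = [
    (re.compile(r"\bi(?: am|'m) (.+)", re.IGNORECASE), "I am sorry to hear you are {0}."),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Your {0}?"),
]

# Generic replies used when no keyword matches.
FALLBACKS = ["Please go on.", "That's very interesting."]

def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())

def respond(user_input, turn=0):
    """Return a reply from the first matching rule, else a generic fallback."""
    cleaned = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            return template.format(reflect(match.group(1)))
    return FALLBACKS[turn % len(FALLBACKS)]

print(respond("My boyfriend made me come here."))
print(respond("He says I'm depressed."))
print(respond("The weather is nice."))
```

Nothing here understands the sentence. The program only finds a keyword, cuts out the rest of the input, swaps a few pronouns, and pastes the result into a template.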
Modern AIs are able to train on this enormous amount of data thanks to a technology called transformers.
Before the introduction of transformers in 2017, training an AI model was a little bit more like training a human.
It had to go through the text one word at a time. And while it didn't learn as efficiently as humans do, it learned in a way similar to humans, which was very slow compared to today's training, where instead of reading the sentence word by word, the transformer looks at the entire block of text at once. It calculates the relationship between all words simultaneously, which allows the work to be spread across thousands of computer chips at the same time. To get there, the text is first broken down into what's called tokens, and each token is assigned an ID number.
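The "relationship between all words simultaneously" step is usually called self-attention. Below is a minimal sketch of scaled dot-product self-attention in plain Python. One loud assumption: real transformers first pass each word vector through learned query, key, and value projections, while this toy uses the input vectors directly for all three, and the two-number word vectors are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = vectors."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score this word against every word in the text, itself included.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The new vector for this word is a weighted blend of all word vectors.
        blended = [
            sum(w * value[i] for w, value in zip(weights, vectors))
            for i in range(dim)
        ]
        outputs.append(blended)
    return outputs

# Three made-up 2-dimensional word vectors (an assumption for illustration).
word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in self_attention(word_vectors):
    print([round(value, 3) for value in row])
```

Each pass through the outer loop depends only on the original input, not on the result for any other word. That independence is what lets the work be split across many chips, unlike the older word-by-word approach.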
Then the model converts that ID into an embedding, which is a long list of coordinates. This embedding is what allows the AI to understand that the number for a cat is mathematically near the number for a kitten. Now, after the text is transformed into a giant list of numbers, it starts looking for statistical patterns to figure out which words usually go together, like New and York. It can also teach itself by taking a sentence, hiding a random word, and predicting it using the available data. If it predicts wrong, it adjusts its internal calculations, and by doing this trillions of times across the entire internet, it eventually learns from almost everything humans have ever written and gains a very good understanding of which words follow each other, allowing it to write comprehensive text on any topic. But if it learns from the internet, shouldn't its answers sound more chaotic than, for example, what ChatGPT sounds like?
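Here is a small sketch of the token, ID, and embedding chain, and of what "mathematically near" can mean. Everything in it is a made-up assumption: the three-word vocabulary, the hand-picked 3-number embeddings (real ones are learned and have hundreds or thousands of dimensions), and splitting on whole words (real tokenizers usually split text into smaller sub-word pieces). Closeness is measured here with cosine similarity, one common choice.

```python
import math

# Hypothetical vocabulary: each token gets an ID number.
VOCAB = {"cat": 0, "kitten": 1, "car": 2}

# Hypothetical embeddings, one row per token ID. Real values are learned.
EMBEDDINGS = [
    [0.90, 0.80, 0.10],  # cat
    [0.85, 0.90, 0.15],  # kitten
    [0.10, 0.20, 0.95],  # car
]

def tokenize(text):
    """Map each known word to its token ID, skipping unknown words."""
    return [VOCAB[word] for word in text.lower().split() if word in VOCAB]

def cosine_similarity(a, b):
    """Return a value near 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat_id, kitten_id, car_id = tokenize("cat kitten car")
cat, kitten, car = EMBEDDINGS[cat_id], EMBEDDINGS[kitten_id], EMBEDDINGS[car_id]

print("cat vs kitten:", round(cosine_similarity(cat, kitten), 3))
print("cat vs car:   ", round(cosine_similarity(cat, car), 3))
```

With these made-up numbers, cat and kitten score much closer to each other than cat and car do, which is the kind of relationship a trained model ends up encoding on its own.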
Because, in my opinion, it sounds quite polite. And, well, the answer is yes, but after the initial training, companies hire thousands of humans to rank the answers until the chatbot learns to be polite and unbiased. Now, all this training is done at specialized data centers where companies like OpenAI or Google link tens of thousands of GPUs together to create supercomputers costing who knows how many hundreds of millions of dollars that can process billions of words per second, which allows training of massive, state-of-the-art models in just a couple of months. Transformers are also responsible for anybody being able to use generative AI to produce low-effort AI content, or what we now call AI slop.
Thanks to the transformer technology, these generative models can efficiently use their training data to create entirely new data in response to input. For example, when you want to create a picture of a cat using an AI, it starts with just random noise at first. But when you provide a text prompt like a cat on a beach, it connects the text prompt to the noise through something called contrastive language-image pre-training (CLIP), which it uses to make sure that the images for a cat and a beach correspond to the embeddings for those words. It can then start the denoising process.
And at every single denoising step, the AI asks itself, "Does this current pattern look more like a cat on a beach than it did before?" And if it doesn't, it adjusts its noise prediction to move closer to your prompt's meaning.
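Why does a good noise prediction matter so much? The sketch below mixes random noise into a tiny stand-in "image" (just a list of numbers) and then shows that if the noise is known exactly, the original can be recovered. Assumptions to flag: the mixing formula and the linear schedule values (1e-4 to 0.02 over 1000 steps) follow a common diffusion-model setup and do not come from the video, and the function names are hypothetical. A real model has to guess the noise with a trained neural network, which is replaced here by simply reusing the true noise.

```python
import math
import random

def noise_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return the fraction of original signal left after each step (needs num_steps >= 2)."""
    signal_left = []
    running = 1.0
    for step in range(num_steps):
        # Linearly increasing amount of noise added per step.
        beta = beta_start + (beta_end - beta_start) * step / (num_steps - 1)
        running *= 1.0 - beta
        signal_left.append(running)
    return signal_left

def add_noise(pixels, alpha_bar, rng):
    """Blend the image with Gaussian noise; alpha_bar is the signal fraction kept."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise

def remove_noise(noisy, predicted_noise, alpha_bar):
    """Undo the blend, given a prediction of the noise that was added."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
        for x, n in zip(noisy, predicted_noise)
    ]

rng = random.Random(0)
alpha_bars = noise_schedule(1000)
image = [0.2, 0.5, 0.9]  # stand-in for pixel values

noisy, true_noise = add_noise(image, alpha_bars[500], rng)
# Stand-in for the neural network: pretend it predicted the noise perfectly.
recovered = remove_noise(noisy, true_noise, alpha_bars[500])

print("signal left after the last step:", alpha_bars[-1])
print("recovered:", [round(value, 6) for value in recovered])
```

By the last step almost none of the original signal is left, which matches the idea of starting generation from pure noise. The closer the predicted noise is to the real noise, the closer the recovered image is to a clean one.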
How the [ __ ] does it know what a cat on a beach looks like? Well, it's trained by taking clear images and gradually adding tiny amounts of random static over thousands of steps, until eventually the image is completely destroyed and has become just pure noise. Then the AI has to reconstruct that same image by clearing tiny amounts of noise step-by-step. And all this training data is then embedded, just like the training data for words is. And after the AI has trained on deconstructing and reassembling an image of a cat enough times, it knows how to navigate those embeddings to reconstruct an image of a cat on a beach, because it has embedded the concept of what a cat is and what proportions a cat should have, and what a beach is and what it should generally look like.
Okay, that was a lot of information, but that's broadly how AI works. So, why don't you support my work by clicking that like button and Naruto running to the comment section like it's Area 51. Okay, thank you for your time. Goodbye for now.