The transition from mere aesthetic mimicry to genuine structural reasoning marks a significant milestone in generative utility. OpenAI is effectively evolving image synthesis from a creative novelty into a precise tool for functional design.
Deep Dive
New OpenAI Image-Gen-2 Is Unreal. The OAI Kitchen is HOT!
How's it going everyone? Welcome back to the MattVidPro channel. In today's video, I want to talk all about GPT Image 2. I don't think any AI company out there has released something this capable to our screens. And yeah, it really is sort of the Nano Banana killer, which is saying a lot. This thing is crazy good. OpenAI just posted this image saying this is not a screenshot. Nope, it is an image generated by Image Gen 2 and it's a perfect macOS screenshot. Near perfect.
There will be subtle inconsistencies, little giveaways. I think some of the first things that are going to crop up in your mind as you take a look at the image generations today are all of those hundreds of little subtle details: correct fonts, correct UI layouts, locations, 80 to 90% correct icons. I mean, the cookies and cream trash can gives it away. But otherwise, and especially at a first glance, you know exactly what this is. A completely clean screenshot of a ChatGPT interface on macOS. And yeah, it's loading up the correct ChatGPT interface because it just knows what it looks like. The model is up-to-date and well informed. You'll notice certain video game styles, obscure websites and art styles will be strongly adhered to by this model. So, not only the ChatGPT interface, but the conversation as well, and it can do a much more detailed one than this. This is already a lot of text, but get ready for paragraphs. "Is anything interesting happening today?"
And then yes, it talks about an OpenAI live stream at 12:00 p.m. PT that is actually happening today. I'm probably posting this video right around when that live stream is happening, but I did get early access to this model and have been messing around with it. Some good news is that everybody is going to get access. Even free users are going to have limited generations, with increasing limits for other plans. Sam Altman posts, "Really excited for this week." I wanted to keep you guys informed. GPT 5.5, which may be a Spud checkpoint, and really sounds like it is according to Greg Brockman, could come later this week along with image generation. So, this might be a really, really hot week for OpenAI. Another reason I say that is because it seems like people with Pro plans are getting access to this new 5.5 quote unquote Spud model directly inside ChatGPT. Little hidden early access. I ran some tests of my own and I think I might have access. We won't touch too deep on this today because I really want to focus on the image generation, but be on the lookout for later this week. And you can already see the potency of a potential 5.5. This is basically RollerCoaster Tycoon 2 recreated as a web app.
It even has a little mini map so you can see what you're building. Inspector, Pixel Park, Coaster, all of this UI and design, a fully working theme park sim game. Wild stuff. Let's steer this train back on track. I want to light you guys up with a couple of quick demos from the community before we get hands-on in the kitchen ourselves. Proper Prompter: "GPT Image 2 is rolling out and wow, it just one-shotted a grid of 100 completely unique pixel art items. They all even have meaningful labels contained in a single image." Pixel art, I think, is a very interesting style in particular for image generation. It provides a unique set of challenges. There are specialized models, but obviously Nano Banana 2 from Google, for example, and now this new GPT Image Gen 2 by OpenAI excel in this area in particular and can even do 100 unique items at once, which is pretty breathtaking. Guys, it isn't a far shot at all to say that you could actually use these for a real game, converted into real pixel art. There are already plenty of AI-assisted tools out there to take generated images with slight pixel art inconsistencies, you know, a wonky pixel here or there, and completely transform them into correctly sized sprites. The real astonishing part is the labeling of 1 through 100 and all of the labels as well. Not to mention how unique each individual piece is, the completely straight lines. I mean, some of these definitely are related, but in theory, this would be for a video game where you're going to have similar items. Your character is going to have different helmets, different keys, weapons, etc. Definitely a little bit of weirdness going on with the crossbows here. Dragon shield is a little mushy, but I think these swords are remarkably straight.
Potion bottles, an orb of power, charm of luck. You can see we're not far off at all from generating a hundred almost usable out-of-the-box assets for a game at once. In the pretty near future, I think that LLM coding models are going to be able to generate assets like this right off the gate and use them in projects innately. That is definitely going to open up a few doors. User Alex on Reddit shows off this example.
Similar to the last one, we're generating a wide range of different images and they're all labeled up through 100. These ones even have their own categorizations and color codings. I think this is supposed to show all of the various technological areas maybe you could use AI inside of.
Regardless, really cool to see this isn't pixel art so much as like stock imagery kind of combined with a graphic or chart, but I'm not seeing any incorrect text here at all. And most of these images are pretty great. Even this ASIC is almost perfect here spelled correctly. Advanced prosthetics. Yeah, this robot arm is a little wonky. I'm struggling to find serious problems or issues with this. It is so astonishing the level of complexity we're generating images with now. You know, guys, it was like what, four years ago, maybe less than four years ago. We were generating images that were actually the resolution of one of these tiny images and had worse quality. We've come so far, so fast. This right here is from Jasper.
"The fact GPT Image V2 can do transparent images makes it miles ahead of Nano Banana." I agree with this if this is actually true. I haven't tested this for myself. I've never seen an image generator do natively transparent imagery like that, unless it's vector graphics. But yeah, to be able to do perfect transparent icons like that right out of the gate and in a wide variety of resolutions would be super potent. I can confirm the wide range of resolutions. Apparently, it can do like over 8,000 aspect ratios or something like that. So, it's like super variable in terms of real resolution and aspect ratio.
All right, guys. I think it's time we started to fiddle with this model ourselves, get hands-on, press all the buttons. But before we do that, I've got a quick word from today's sponsor.
Today's video is sponsored by Verda, formerly DataCrunch. With how fast AI is moving right now, compute becomes the biggest bottleneck. Not just getting access to GPUs, but getting access to the right infrastructure without getting wrecked by traditional cloud pricing. Now that is where our friends from Verda come in. Verda is a European AI cloud provider built specifically for AI workloads, and this is not general-purpose cloud computing trying to stretch itself awkwardly into AI. Their infrastructure is designed for both large-scale training and high-concurrency inference directly from the ground up. When it comes to Nvidia hardware, they offer a wide scope to choose from: V100s up through B200 and B300 systems, along with NVLink GPU VMs and InfiniBand clusters for those heavier workloads. Verda is now an official Nvidia preferred partner, which means tighter alignment with the newest Nvidia stack. They've even opened early access for the latest GB300 NVL72 clusters. Another thing I like is that they're not just infrastructure, they also run in-house AI R&D. One of the biggest advantages is cost.
With Verda, you can get the same elite Nvidia hardware, and they claim up to 90% less than traditional hyperscalers. And you're getting European-hosted advanced infrastructure with GDPR compliance and data centers powered by 100% renewable energy. If you want serious AI compute without legacy cloud overhead, check out Verda with the link down below. MVP viewers get $50 in free trial credits if you sign up with my link and use the code. Thanks again to Verda for sponsoring today's video. Now, back to your regularly scheduled content.
Welcome back, folks. All right, so there's all kinds of silly nonsense you can get up to with this model. A lot of the A/B testing images that were floating around were screenshot focused because it could adhere so well to those more unique styles that image gen typically fails at. So, "GTA 5 screenshot" that is perfectly convincing. That's something I've been using. Perfectly believable, perfectly convincing. It gives these image generation models a very clean target in just like two tokens. Trevor is riding a T-Rex. So, obviously I wanted it to adhere to that specific character. He's chomping down. You're playing as Michael.
And for some reason, Arthur Morgan from RDR2 is in a silly vehicle. And of course, the generation does not disappoint. You saw the other images.
Since we're in ChatGPT, we are somewhat limited in terms of resolution, but I think this looks awesome. Trevor's got his classic stained shirt on. The hair is very close, and the facial features. Maybe a little washed out. His hands certainly are losing a little bit of detail. The T-Rex looks awesome. It's definitely in GTA style. Maybe a little bit more detail, but he's got his little arms, his legs stomping around, obviously destroying things here in the city. All these blown up, exploded vehicles. Even the map UI is down there and correct. Although this is obviously not a real location in the city and there's some blobbiness, it looks relatively accurate. Arthur Morgan in his silly vehicle. He definitely looks accurate in terms of clothes, his hands, his hat, even his facial features, but his eyes definitely lost a little bit to the resolution there. A lot of those finer details you're going to notice in a hyper complex scene sometimes get zapped away or lost in the mist. For example, if you try this same exact prompt in Nano Banana 2, this is the type of image you're going to get. It doesn't feel like a screenshot at all. From the lighting to the various art styles and the way things are portrayed, there's a 2D composite type of feel going on here.
It's getting a lot right, but definitely, I think, worse than the last image we saw. The map's in the wrong spot. Although, it does look decent.
Maybe some of the facial details are more recognizable, but it's just because the characters are closer up to the actual camera. It just doesn't give off the same feeling. It doesn't have that screenshot believable feel. It's more Photoshop, more fan art. I did more video game screenshots. This is for the legendary Starfy, but reimagined as a brand new game on the latest consoles, so better graphics. I really like how it approached this. If you've ever played the original Starfy games, the UI design and actually the gameplay screenshot itself gives off a very exciting and professional feel to it. It just feels very real. I am thoroughly convinced by what my eyes are seeing. Even just down to the font choices and the subtleties in the UI design. Switch gears and try something realistic. A poster. A movie about crabs and how we are all destined to be one. But it's a serious documentary completely rationalized out.
It looks epic. Somehow you want to see it. That's the tricky part the model had to pull off. It actually reasoned and thought about this for 2 minutes. This is definitely a reasoning image generation model. Very similar to the nano banana setup inside of Google Gemini. But I thought 2 minutes was a pretty long time to reason and think about this prompt. And yet the image it produced is well thought out. It has a lot of text and I think no doubt about it is accurate to the prompt. Evolution keeps arriving at the same answer.
Carcinization, a documentary event. And we got the crab there. Very, very dramatic poster. Even accurate text: "Title Cycle Pictures presents a New Terrain documentary in association with Depth Mark Media. Original score by Hayden Bosworth, edited by Lena Morgan."
And it goes on and on. I don't know if these are real people. I kind of hope not. I don't think ChatGPT would use real people. It would probably just make people up. Title Cycle Pictures with its own logo, and New Terrain documentary. Very funny. It definitely captured exactly what I want, like a completely serious and believable poster, but about something totally ridiculous. And yet still, the image generation model is so good, I want to watch this movie. Just look at that epic crab shot and the subtle Earth or planet behind it. For some reason, all my prompts made this thing sit and reason for like a minute. Real photo of the grand opening to the weird mechanism store. Initial hit to weird inventors and doohickey lovers everywhere. Now lost in the before times. Believable contraptions. But you can go look and see it did thinking traces for a minute to get this accurate late 90s or early 2000s. I didn't have to specify it, but that is sort of what I wanted because it's lost in the past. The photo it produces is completely realistic and like slightly old-fashioned, old school. This model does kind of have an HDR cloudiness effect to it almost that's really subtle. You can see it in the balloons here. You can see it also up kind of on this wall or on this building. I get that it's supposed to be an old dirty building, but I've been noticing it in a lot of the images by this model. There's just like this kind of hazy cloudiness almost, but it's really, really subtle and it's only in the fine grain shadows in the areas you don't really look at too closely. So, weird mechanism. Love the strange, almost Wonka-like title banner across it.
Obviously grand opening. You can even see this little sign here that's super tiny, only a small amount of pixels, come in and see a weird mechanism in action. It really knows how to make a character play its part. Well, really visually bring them to life. These guys do have similar shirts and similar faces. They're both kind of wearing glasses, although they are different.
They could just be brothers. The rest of the people do appear to be entirely unique. And of course, the main thing down here is all of these kind of homemade contraptions. brass, wood, different just doohickeys and weird little thingies. They look like they have functionality and that you could click clack and you could press them and use them, but obviously the AI has completely hallucinated and made them up. This model is really good with color. Like I said before, subtleties, those small differences, adding a little bit of grain here, a little bit of additional contrast. It's the small things that really make a big difference sometimes and lead you away from a more stereotypical and goofy slop feeling model. Of course, Nano Banana still being really great, but it's just going to start to feel worse as we get better and better models such as this one. We think we hit the peak. We haven't. Then I got devilishly simple Image Gen, the biblically accurate angel of plumbing.
That's all I'm going to ask for.
biblically accurate angel is a very potent little set of tokens because portraying that concept especially visually is usually infinite or high complexity so much that it's overwhelming and then I can just attach that to something mundane like plumbing and get a crazy image gen result from a reasoning model like this. We'll zoom up real close so you guys can get a good look at the detail we expect from a biblically accurate angel typically a lot of eyeballs. That's how it's portrayed most of the time. So, we definitely have those, but they're surrounded in brass and fittings, and all kinds of plumbing work, spouts, nozzles, gauges, and things. It's complex, but it's all interwoven. I love all of the connection work. With something of this high detail, you can see a lot of the gauges aren't really legible. It's kind of mushified in some ways, in some areas. But, I'm pretty happy with this. It's punchy. It's shiny. It's realistic. The text is accurate. And I love the background as well. Lord of leaks, keeper of the flow, fixer of all things. And I really like at the bottom that it has like a sink drainage thing. Just as if you're really looking under the sink. The whirlpool down there and the shiny bathroom and a toilet over here as well. Yeah, lots of little intricate details this thing does. Divine service since forever. Holy plumbing. It's easy to find the failures because we're so far from perfection.
And yet, we've come so far in terms of producing an image that's just fun, entertaining to look at, or even in a lot of cases actually useful. I also tried the same prompt with vibe coding, but I asked it to include a bunch of little Easter eggs and hide additional things, like make a full masterpiece, you know, four main takeaways visually, which it was able to actually distinguish out. So, the biblically accurate angel of vibe coding. You can see a /imagine prompt up there. A vibe, priority, ship and learn. Definitely including all those little quips.
Direction matters more than specification. A north star beats a 100-page doc. Vision build else.
Overengineer. It's like mixing code with all of these almost biblical phrases people use when vibe coding. Trust the vibe. Verify the output. Feel it. Build it. Test it. Ship it. Intuition and tests. Deploy and pray. 404. Taste is the final compiler. You are the human in the loop. Oh, here's another little Easter egg. It hid a "works on my machine"
sticky note. Very funny. Also has a bunch of eyes. This one's more gold and like swinging panel. I asked for a photo collection of 30 labeled intricate mini unique bots, androids, automatons. They have their own purpose. It thought for about 4 minutes on this one, and it's actually able to provide you with a slide reel up to three here. I don't know how many it's allowed to generate at once, but you can actually see it do one and then do another and do the last one to actually complete the full 30.
Pretty crazy. 10 on each image. So, that's a very agentic image generation behavior. I was pleasantly surprised. It can also generate an image, realize that it's not good enough, scrap it, and then make you another one until it gets to the perfect one. Very, very intriguing behavior. You could take a look at these miniature bot collections, and yeah, they're all pretty awesome, but there is some real problem solving going on when it does prompts like this. These are highly complex and it's bringing things to life. Definitely cool stuff. These all used to be like Stable Diffusion level generations themselves, or like good Stable Diffusion ones, because most of them actually have text. Now it's like, here's 10 of them at once, all labeled, and you know what? Here's the images agentically, one after another. Next up, I tried some more cursed and deep fried memes. These went okay. Sometimes it just takes things a little bit too far. Not as good as those simple complete brain rot type things you'd get from the previous ChatGPT image gen, but still definitely capable of making something that is a hot disaster. A deep fried shocked cat at work. Check out this perfectly believable original iOS 6 launch keynote image generation. This thought for about a minute and 40 seconds. And here is our output. Yeah, definitely has that iOS 6 icon feel and presentation look. That is not Steve Jobs perfectly, but it definitely resembles him and he's got the right fit. Just ask Siri, built into Photos, available this fall. Create beautiful images right on your phone.
Can even make cat photos. That really is how I think it would be presented. I asked for an edit as well. Make the next logical image. press calling it bold.
He's got the Apple logo with a cup, fully sipping, with the turtleneck, and again, very similar dude. It's not Steve Jobs.
It's someone. Full stop. Single image gen infographics: this thing nails them. As good if not better than Nano Banana 2, Nano Banana Pro. A day in the life of a toilet bacterium, taking you through the entire process. Our resident, a hearty rod-shaped bacterium, invisible to the naked eye but perfectly adapted to life in your toilet. It kind of gets into detail. It's a little bit gross, but you know, I wanted to do something a little bit more unique. Definitely not in the training data. The nutrient pulse, midday competition. You can actually go through and read each individual little blurb if you want. It's kind of interesting, but there's a lot of information here. It's smart. It works hard. And when you ask for an infographic, it's going to give you something you could really put up on the wall as a poster, fully readable.
Another one just toying around with that aspect ratio. This is a day in the life of spring fungi. Seasonal cues meeting the fungi on one side and then the other side taking care of the actual time.
Dawn, morning, midday, each with correct blurbs, realistic images that kind of portray what we're looking for. And of course, I asked for extra Easter eggs, too, which at the bottom you can find a peeking gnome, mushroom-shaped cloud, acorn with glasses, all these different little things hidden inside of the infographic. It's just not really making too many mistakes at all. And this is not an easy prompt. It's a full educational piece in one image. We're starting to run out of resolution when you get into complexity like this. But taking it up to 4K is even more mind-blowing. Especially from the demonstrations that I was shown, it's just all it's a whole new world with AI technology. It really is. Just even those little subtle things like completely straight with this little wind icon. It's almost meaningless. But like that is really really correct. It's almost pixel perfect in some scenarios.
There are, like I said, and I keep saying this, always these subtle little mistakes, but let's try image editing.
Here is my actual cat. Take this pure white little dude and make him an alien.
He's kind of already an alien. Thought for 50 seconds on that. And yeah, it actually produced a kind of gross looking detailed alien, but it like kept the cat's main features and a lot of the important information about the original photo. It changed a lot though. Like the contrast is severely bumped up. It's more vibrant. My arm is I don't know like gross almost. It looks poisoned.
Maybe I'm catching a disease from my alien pet. It kept his body pretty much the same but really changed his face. Like his eyes are massive and alienesque. He's kind of squinched up with his little facial nose. He's got two little glowing antennae, which I kind of find cute. More pointed, bigger ears.
Glowing little bioluminescent belly, iridescent green shimmery fur. Pretty awesome, honestly. Like, I like this. I think it's a little bit messed up kind of where the paws are in the face, but yeah. I mean, just for turning him into an alien, this is smart, intelligent, fun, like, ooh, I didn't think it was going to be that good, that detailed. I didn't think it was going to work that hard on something so simple. Accomplished.
People are going to love this thing, man. I really think people are going to have a great time with this model.
It's pretty great for image editing. Oh, you know what? That's right. We did want to try transparent images. Transparent sprite image for a lemon time machine.
Sure. Let's see what it does. Here is our image generation result. And if we look, yep, that is not a real transparent image. So, I'm sure you could cut this out, of course, with tools, but it would genuinely be easier to just generate a pure white background. We'll try it again. Honestly guys, when concerning overall image generation time, it can be a varied result. It depends on how complex your image is. If it actually has to do a lot of writing and really sit down and think and reason about what it's making, it could take several minutes, which is pretty insane. Kind of a long time to wait, but you get a result most of the time that is better than anything else out there right now. It's worth waiting for. Okay, here is our other transparent test. Diet Dr. health. I actually kind of like this. Uh, again, not really transparent. So, I'm not seeing genuine true transparency. If this is something that can be enabled in the API, oh, I would love to see it. But, uh, for now, I'm not going to report and say that it can do transparent images. I don't think so. Let's try messing with aspect ratio.
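The transparency tests above come down to one question: does the downloaded file actually carry alpha information, or is it an opaque image with a painted-on background? Here is a minimal Python sketch of a first-pass check, assuming the result was saved as a PNG. The function name `png_alpha_info` is hypothetical and has nothing to do with OpenAI's tooling. It only reads the PNG header and chunk list, so it can rule transparency out, but it cannot prove that any pixel is actually see-through.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_alpha_info(data):
    """Report whether a PNG byte string is able to store transparency.

    Color types 4 (gray + alpha) and 6 (RGB + alpha) carry an alpha channel.
    Other color types can still be transparent through a tRNS chunk.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")

    pos = 8
    color_type = None
    has_trns = False
    # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"IHDR":
            # IHDR body: width(4) height(4) bit depth(1) color type(1) ...
            color_type = body[9]
        elif chunk_type == b"tRNS":
            has_trns = True
        elif chunk_type in (b"IDAT", b"IEND"):
            break  # tRNS must come before the image data, so stop here
        pos += 12 + length

    if color_type is None:
        raise ValueError("missing IHDR chunk")

    return {
        "color_type": color_type,
        "has_trns_chunk": has_trns,
        "can_be_transparent": color_type in (4, 6) or has_trns,
    }
```

Calling it on the saved bytes, for example `png_alpha_info(open("lemon_time_machine.png", "rb").read())` with a hypothetical file name, returns a small dictionary. If `can_be_transparent` is false, the image is fully opaque. If it is true, you would still need to decode the pixels to confirm that some alpha values are below full opacity.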
I think it's understanding what I asked for in the prompt. You can see what it's doing. It's doing the carousel of multiple image generations. Here is the first one. And it actually did seem to get to 100. They're side by side though, not in the complete straight line that I asked for, which might not even be possible. I don't know if it can do an aspect ratio that crazy. Yeah, guys, here is the image generation result. You can see how varied they can be. It's really great to see something that feels so raw in a sense on release, although I really don't think that's what it is.
I'm sure this thing is pretty safety fine-tuned and decently heavily fine-tuned, but really this is an architectural change potentially and a jump up from the previous in terms of output variety and ability to scale to all kinds of different unique places.
Its latent space just has more depth, more shapes, more things to show you, and more ways to accomplish what it could already do before. Treating the image generation pixel space like its own little operating system. You can see it messed up. So wait, do we have more than 100 technically? 59 60 61 62. Wow.
Oh, and then 68 messed up as well. So it definitely started to struggle a little bit. Oh man. Wow. This one is a little messed up. And you can see not too much detail on our individuals. It tried its best. Like this guy is definitely like a death with a scythe. Pink wings.
Superhero type things. A mermaid. Let's see here. Let's see. A bumblebee costume. Panda bear. Hm. You can see it messed up here, too. Yeah.
Theoretically, it's supposed to self-correct for this sort of thing as well, but I have a feeling that these numbers are too small for the model to like generate and then read and go, "Oh, I need to fix that." It just kind of says, "Oh, I think it's okay." Jumping on over, I'm going to show you some more community tests and generations. And finally, we'll end off with OpenAI Spud. Chedda showing Image Gen V2 passing another test that no other AI image gen has been able to succeed at. High-resolution map of Earth, but with the land and water inverted, meaning all continents are made of blue water and all oceans are solid green land masses. This is something easy for us to kind of imagine and envision in our minds because it's like, oh, all the water suddenly becomes green grass. But for AI, since this is something super out of the norm for the training data, it really struggles. But this new thinking model, Image Gen 2, definitely is able to nail it. Great texture for the land as well, showing various mountains and things and basically just completely inverting everything. It's kind of trippy and weird to look at, but there you go.
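For the numbered 1-to-100 grid test earlier, where a few labels came out wrong, the manual counting can be turned into a quick audit. This is a minimal Python sketch, assuming you have already read the labels off the image into a list of integers, by eye or with an OCR tool of your choice. The function name `audit_labels` and the sample list are hypothetical, for illustration only.

```python
from collections import Counter


def audit_labels(labels, expected_count=100):
    """Compare the labels found in a grid against the sequence 1..expected_count."""
    counts = Counter(labels)
    expected = range(1, expected_count + 1)
    return {
        # Numbers that should appear but never do.
        "missing": [n for n in expected if counts[n] == 0],
        # Numbers that appear more than once.
        "duplicated": sorted(n for n, c in counts.items() if c > 1),
        # Numbers outside the expected range entirely.
        "out_of_range": sorted(n for n in counts if n not in expected),
    }


# Hypothetical reading of a grid where tile 68 was mislabeled as 86.
sample = [n if n != 68 else 86 for n in range(1, 101)]
report = audit_labels(sample)
print(report)
```

A report with three empty lists means the numbering is clean. Anything else points you straight at the tiles worth zooming in on.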
Also another test by Chedda: an Image V2 keyboard where each letter on the keyboard is represented by a creature from the animal kingdom, and completely succeeding at this as well. Pretty darn fire, man. So much text you wouldn't believe. And it's also very capable multilingually as well. I saw some Chinese demonstrations that were just as complex as this. A New York Times readable and legible newspaper article about GPT Image 2 being released. But yeah, you can actually go through, read the articles, check out what the latest news is on a completely fake photograph of the newspaper generated by AI. Wild times we're living in. Chedda also showing off that Image V2 is solving a QR problem previously unresolved. We asked for a QR code that scans to the URL Wikipedia. The QR code must actually be valid and functional. This actually works. So somewhere in the training data, this thing is intelligent enough to pull that information of a real QR code out and accurately display it. So apparently if you scan this... Oh, good grief. Yep. It actually does indeed take you to Wiki-freaking-pedia. That is nuts.
The knowledge is being crystallized in these weights. Eventually some of it could become outdated, such as this. But a lot of it is those subtle things you would never think to remember in the front of your mind. Or in this case, something you wouldn't physically be able to do: remember a QR code by hand. I don't think anyone's really doing that unless they have a photographic memory. Flowers uploaded some images that are a little bit more creepy, a little bit more unnerving, unsettling. An obscure truth about OpenAI that nobody will understand. They act like they don't see it, but the seventh echo always folds before sunrise. Their reduct path. Oh man, creepy. This model just kind of telepathically almost gets vibes like this so easily and reasons through your prompt, producing an image that is way more detailed than even what you could ask for, which could be as short as a small sentence. They filled the sandbox but forgot about the hinge. Those Wojaks, obviously, and all kinds of crazy internet culture from my Discord server, which you should join by the way. AI creative Rorow generated this magazine cover. Crystal Sparkle caught dating a hot dog man. It is a pretty censored model in terms of being able to obviously use real people's likenesses or copyrighted works. But there are going to be jailbreaks and ways around that. And honestly, for a model this potent, I'm kind of okay with the safety. This thing's already going to be tricking people in crazy ways, as AI has been doing for a few years now, and people with a computer and talent have been doing for decades at this point.
But yes, easier than ever before.
Another user, Pieruno, generated this screenshot of what I believe is supposed to be Counter-Strike. Gun looks pretty familiar. This $600 is looking super familiar along with this UI down here and some of the UI at the top. Of course, Sam Altman advertising any number of things. The ChatGPT Bzner limited edition. Nuts. It looks like people are already getting access to Image Gen 2. The rollout has already begun and maybe by 3 p.m. it will have rolled out to all users. Not too sure.
OpenAI Pro Vision test. According to Greg Brockman, 5.5 is the newly trained model and is the initial checkpoint for Spud. This is the economy mover model.
Very efficient. It's supposed to find a red stocking in this image and mark it.
And apparently it was able to pass this very very difficult test that a human eye would struggle with off the bat.
Chedda also showing this thing generating SVGs of Xbox controllers that have a plethora of detail and doing ASCII art that is like realistic almost. Yes, Spud is cooking hard, and Chedda is showing this off with early access through the Pro plan of ChatGPT. I also have the Pro plan, and after running a few tests of my own, I realized I think I also have early access to this GPT 5.5 Spud model that might be releasing later this week, based off of what we earlier heard from Sam. First thing I generated is this Ridgger game. This is a driving physics simulation. This is the type of thing that's really hard to even get working first try.
But I must say it really seems good enough to just make stuff like this working right out of the gate. It's really complicated. I mean, there's real car physics here, relatively speaking, and transfer of weight. There's also NPC cars and a mini map. The camera is a little wonky. The controls are definitely a little wonky. It's very hard to control.
But it is actually like the most playable version of this prompt that I have ever been able to get from a model.
Regardless, that bump is more marginal.
What's really crazy is that it's doing it in two to three times less time.
So, previously with GPT Pro, I would get a result that's 10 to 15% worse than what you just saw and it would take 80 minutes. Now, 25 minutes and we get something like that. It is definitely like three times as efficient. Totally crazy. It generates text fast. I mean, look at this Minecraft clone I had it build me. This is Blink Block 3D. I wanted something kind of creepy and weird. But yeah, this is Minecraft, but like all of the blocks and everything are alive. The controls again, like simple stuff is messed up. The controls are reversed for going sideways, but we've got a full house here. I can actually use a pickaxe and mine different blocks. And you can see they are all alive and blinking. And even when I hover over a block: "Loyal planks are happy to become a wall with opinions." There's food. I'm supposed to be building a shelter. There even seem to be like pig mobs. It's doing a lot here. I mean, compared to the previous Minecraft clones, pretty mind-blowing. There's even a day and night cycle, I think. So, yeah, expect crazy things to come regarding Spud.
More tests coming soon. This is just a peek at what you can expect, hopefully later this week. It looks like I got to go, guys. A blinking unfolded from the dark and the trees are filing a zoning complaint. Oh, purple Grimace monsters. AI progress this week is looking like it's going to be a wild firecracker. And OpenAI is leading the show. My brain is just filled with all kinds of crazy ideas and jumbled nightmare fuel that I never was able to try before with lesser models. GPT Image Gen 2 and hopefully 5.5 Spud this week.
If you like the video, please consider subscribing and I'll see you guys in the next one. Thanks for watching and goodbye.