Google's Omni model moves beyond the limitations of rigid prompting by treating video generation as a multi-modal conversation, prioritizing creative flexibility over technical friction. It is a significant evolution in workflow that values human intent, even if the final visual output still trails behind specialized high-fidelity tools.
Deep Dive
Google's New AI Video Model is Insane
Some tests from Google's brand new AI video model have leaked, so we're putting them to the test against Seedance 2.0. There's a brand new workflow making the rounds online for controlling emotion in your cinematic AI videos. But how does it stack up against really good prompting? And Krea launched a brand new AI platform that promises to give you really nice AI images. But how does it stack up against other tools on the market? Well, we're going to find out all of that and more in this week's episode of AI Film News. Thanks for watching.
Now, before we get started, I want to give a hat tip to the Curious Refuge community in Atlanta, who recently hosted an AI filmmaking meetup. It looked like a lot of fun, and I think we're going to do that every month from here on out. So, if you're looking for meetups in your area, be sure to check out the meetup section over in the Curious Refuge platform. It's completely free to join.
I want to kick things off by talking about the big news from Google. So, every year Google hosts an event called Google I/O, and typically it's when they announce really big updates from around the company. Now, in this AI era, they basically do announcements all year, but I/O is still an inflection point and an exciting place where you would typically expect to see some brand new announcements. Now, we haven't heard any rumors that Google Veo 4 would be announced at the event, but there have been some leaks from an alleged Gemini Omni video model.
So, what is an Omni model? Basically, it allows you to upload videos, images, and sound directly inside of a chat experience and be more creative in your outputs. You don't have to follow the normal system of a start frame followed by animation through a prompt that you probably use in a lot of other AI video tools. A lot of times omni models can have quality that is not quite as good as direct image-to-video models.
And so if Google is able to deliver a really good omni model that has decent fidelity, it would be a very real competitor to tools like Seedance. So there was this video that leaked this last week that is allegedly from Google Omni. And you can see we have this professor writing on the chalkboard. Some aspects of the video are impressive. Other ones don't make a lot of sense. For example, he draws a whole X with just a single line. So even with the video that's going viral, you can already see some issues. Not saying that Omni won't be amazing, but it's certainly not perfect, at least in its current form. So I wanted to compare this against a similar generation inside of Seedance. So I uploaded a character reference image along with a complex formula and asked Seedance to generate a video, and here's what we got.
>> V1 equals the square root of 2 GME over RE where G is the gravitational constant, ME is the mass of the Earth and RE is the radius of the Earth.
>> Okay, so you can see, at least from this single test, it does seem like Google Omni does a better job than Seedance Omni. But only time will tell. We're very excited to hopefully get access to this tool this next week. We were actually looking on Polymarket, and allegedly there's a 92% chance that it drops on May 19th. So, we will pin that date in our calendars. Next up, the team at Krea has announced Krea 2, which is an updated platform and a brand new native AI image model that allegedly gives you a better operating environment for pulling off your creative styles.
The entire system really centers around style reference through mood boards, where you have the ability to upload your own style reference and generate images in return. The idea is once you curate your mood board, you'll be able to create outputs that are more in line with the specific style from the images and the context that you're applying to the AI system. We've shown time and time again on this channel that when you're able to upload images and provide more context on the front end, you have way more control and consistency over your outputs. So to use Krea 2, all you have to do is go to Krea's website.
You'll find a link below. They're giving early access to people who are subscribers to Krea. So first things first, we're going to type in a very quick prompt. You can see we have just a quick little prompt of an eerie mall. Basically, we want to create this abandoned mall shot. And from down here, you can select whichever image model you want to use. I'm going to select Krea 2 Large just to get maximum quality to compare it against other AI tools. Now, you do have the ability to bring in your own image assets for a style transfer.
And then you can also grab images from the mood boards that you create to curate the overall style in a certain direction. We'll talk about that in just a little bit. You can kind of think of it as mood boards inside of tools like Midjourney. It works in a very similar way. So when you're ready, go ahead and hit generate. And after a few seconds, we got this image here from Krea 2. And it looks okay. I would say there are some parts of it that feel a little wonky. The background looks almost like a matte painting over an actual real image. But the fact that it was able to generate it very quickly is obviously very interesting. Now, I did want to compare this against other tools on the market.
So, this is the actual image that we got from GPT2 with the exact same prompt.
And GPT2 looks, I would say, a bit more realistic, but there is this kind of weird artifacting that's beginning to happen a bit more in GPT2. I'm not entirely sure why, and other image models have done this in the past. The quality is really good at the beginning, and then it slowly degrades over time.
I'm not too sure why and I don't know if that's exactly happening in GPT2, but I have noticed it a bit more over this last week. And here's the same generation inside of Nano Banana 2. And there's nothing about this that feels super realistic. I like the composition, but outside of that, it has that very Nano Banana 2 hyper contrasted look, and the text really is just too perfect.
Again, it looks like someone typed out a font and put it into the image. So, all things considered, Krea seems to be a decent competitor, at least in this first test. Now, I wanted to test it against a few other ideas. So, we prompted for a cinematic wide shot of a group of people dancing in the 1950s with a ton of other prompt language applied. Basically, just trying to get a cinematic still. And inside of Krea 2, we got this image, which looks okay, but then when you start to zoom in on people's faces, there's a lot of weird stuff happening. So, it does not seem to be doing a very good job at all at fine details. Now, let's compare it against GPT2, which gave us this image.
And yeah, pretty amazing. The details on the characters in the background are really, really good. I wouldn't say it comes across as completely photorealistic. It's almost like an image that was restored, but it is pretty darn realistic all things considered. And this is the same image inside of Nano Banana 2. Again, it feels very Nano Banana. Not the best generation. GPT2 does seem to be doing a better job. Now, obviously, text to image is interesting, but what's far more practical for cinematic AI workflows is bringing in reference imagery, whether it's through a mood board or just style references, to push your generations in a certain direction.
So, with Krea 2, they have this new mood boards feature on the left. And to create a mood board, all you have to do is click the plus icon to create whatever mood board you're looking for.
And now you can upload images that embody the overall style that you're looking for. So for our example, I'm going to bring in some style references from the movie Blade Runner. So the idea is not that we're creating original IP here. We're just doing stylistic exploration just to get creative ideas.
So I'm going to go ahead and drag and drop those images into the new mood board section. And when we're ready to generate our image now, we just go to the image section. And from here, we can type in our prompt, select mood board, and then select your mood board to give your images just kind of a jumping off point in the right direction. And when you're ready, go ahead and hit generate.
So, inside of Krea 2, we got this image here. And there's just something a little off about the people. Like, he just feels very soulless. And her fingers there are kind of messed up. So, it's not doing the best job. The color grading looks right, but the composition of the characters feels very much like a Midjourney-esque output. Now, if we compare the same image from GPT2 with the style references, we got this image here, which I think looks really, really good. I think it did an amazing job at capturing the essence of what we're looking for. And then finally, we got this image from Nano Banana 2. Again, it just doesn't come across as super realistic. It's very soft and plasticky.
And a lot of times inside of Nano Banana 2, it will actually generate the likeness of other actors, which is not good if you are working on a cinematic project. For example, when we ran the image through Nano Banana 2, it gave us this image, and that's clearly Ryan Gosling. It gave us this image, Ryan Gosling again. This image, Ryan Gosling.
And then finally, this image, which is Ryan Gosling talking to Amy Adams, which I don't know if that's like a Blade Runner and Arrival crossover movie, which actually might be kind of cool, but of course, it's not helpful if you're trying to work on an original project.
Now, it probably has to do with the reference data that was used to train the actual models. Of course, in this example, we were using actual stills from the film as a creative jumping off point. If you wanted to have IP that you could claim ownership of, you would need to use copyright-cleared data or images that you have the rights to use to create the overall imagery and style for your image project. But I just thought that was very fascinating.
So if you're going to these tools to generate style reference images that you can then bring into tools like GPT2 or Nano Banana, then you probably just want to stick with Midjourney, because I think it's better at creating an overall cinematic style. Our studio Promise is going to be at the Cannes International Film Festival this next week. Dave Clark is speaking alongside Darren Aronofsky along with tons of other talented creators from around the space.
We would love to see you at the event.
While I will not personally be there, if you happen to be at that event on May 14th or 15th, be sure to say hello to Dave, who is speaking at the AI for Talent Summit. Now, speaking of talent, I do want to let you know that applications for the first round of our distinguished network of artists are closing on May 20th. If you want to apply to be a part of the exclusive network that Promise and the larger creative world go to for artist recommendations, I highly recommend applying. Even this last week, there were already artists in that program getting connected with jobs. We would love to have you apply. It's open to any members or students who are a part of the Curious Refuge community. You can click the link below this video to learn more.
Next up, we came across a really interesting workflow from a creator on X called Deepon Ratinum, who basically showcased the process of uploading a storyboard and then working with Seedance 2 to actually generate the individual shots.
Essentially, you upload a panel that has a bunch of individual images in your storyboard and then break down the actual seconds that you want your shots to jump through. You can define that you want shot one to be like this, shot two to be like this, and do all sorts of duration-based prompting to make it happen. Let me show you what I mean. So, I'm inside the Dreamina platform, which is the direct integration between CapCut and Seedance 2.0, and we need to bring in a reference image that has the individual storyboard cells for the project that we're working on. For our example here, I have this image that was generated inside of GPT2. But of course, you could get as custom as you want with defining what you want your individual shots or the story beats to be like. And we'll go ahead and drag and drop that image into the image reference section.
Now, I should note again I'm using Seedance 2.0 with the Omni reference feature. Now, from here, I'm going to paste in the exact prompt, but you can find a link below this video to copy it for yourself. Let me just quickly break it down. Basically, we're saying from 0 to 3 seconds, we have this specific shot.
From 3 to 6 seconds, we have this other specific shot. And we're breaking down what we want to see in the final output.
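The duration-based breakdown described above can be sketched as a small helper that turns a shot list into a single prompt. This is only an illustration: the `Shot` class, the `build_timed_prompt` function, the output wording, and the example story beats are all hypothetical, and the actual prompt used in the video is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start: int  # seconds from the start of the clip
    end: int    # seconds from the start of the clip
    description: str


def build_timed_prompt(shots: list[Shot]) -> str:
    """Turn a shot list into one duration-based prompt string."""
    lines = []
    previous_end = 0
    for number, shot in enumerate(shots, start=1):
        # Each shot must start where the previous one ended and last at least 1s.
        if shot.start != previous_end or shot.end <= shot.start:
            raise ValueError(f"Shot {number} leaves a gap, overlaps, or has no duration")
        lines.append(f"Shot {number} ({shot.start}-{shot.end}s): {shot.description}")
        previous_end = shot.end
    return "\n".join(lines)


# Hypothetical story beats, one per storyboard panel.
storyboard = [
    Shot(0, 3, "Wide shot of the character entering the room (storyboard panel 1)."),
    Shot(3, 6, "Close-up on the character's face (storyboard panel 2)."),
]
prompt = build_timed_prompt(storyboard)
print(prompt)
```

Checking that the time ranges are contiguous before generating is a cheap way to catch a shot list that skips or double-books a second.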
And when you're ready, go ahead and hit generate. And after a few minutes, we get this.
I mean, come on. Obviously, there's some logic that's a little broken, but with just a few generations, you at the very least have an incredible previsualization tool, if not something larger that you can directly bring into your film with a little bit of upscaling and finessing. Next up, we came across a really interesting breakdown from a creator called Kota on X. Essentially, the value proposition is you're able to control emotion more precisely from your character by defining two things that you probably don't think about inside of your AI videos. The first is a term called valence. The second is arousal.
The idea is that whether those settings are high or low will help to define certain aspects of the character performance. Now, I'm not going to get into the entire prompting experience here. You'll find a link below this video to grab the exact prompts for yourself, but let me show you it in action and then we're going to compare it against just naturalistic language prompting. So, I'm hopping over to Runway just to show you that you can use Seedance in a lot of different platforms. And I'm in the video generation section. And for our image, we have this multi-panel image of our character with high valence and high arousal or low valence and low arousal, and just different examples of that in action.
And here's just a bunch of emotive states of our character. Again, this is coming from GPT2, which is, I think, the best image generator on the market at the moment. Okay, so that's the image that we're bringing in. And for our prompt, we're keeping it crazy simple.
We're saying a 7-second cinematic close-up of a shot where the valence is very low, the arousal is very high, and the character is saying, "I've been waiting for this for a long time." And go ahead and hit generate. So, after a few minutes, we got this generation utilizing this technique.
>> I nin I've been waiting for this for a long time.
>> Okay, that's pretty good. There's emotion there. Obviously, there's kind of a weird stutter at the beginning, but not too bad. Now, I was really curious to see how this actually stacks up against just using naturalistic language to prompt. And so, I have this prompt here. It's basically: the woman looking at the camera says, "I've been waiting for this for a long time." She's very emotional. There's a lot of weight.
Her voice is shaky. You get the idea.
We're just using more human-like language to create the output. And go ahead and hit generate. And that gives us this shot.
>> I've been waiting for this for a long time.
>> Okay. So I actually think that the naturalistic language prompting did a better job. Let's test this one more time just to see if there's anything here. So again we have our starting image of our woman in this scene here.
We used the valence and arousal based prompting to create this output.
>> I your see I've been waiting for this for a long time.
>> Okay, that one actually is pretty solid.
I think it did a good job. And let's compare it against the same prompt with naturalistic language.
>> I've been waiting for this for a long time.
>> Okay, so that one's a little more flat.
So, what would I give as a recommendation in all of this? Well, I think you should default to naturalistic language prompting when you're working with these AI tools. We've tested everything from JSON-based prompting to very specific structures that are not necessarily in line with human language. And almost across the board, either there's no distinguishable difference between the two or the hassle of trying to pull off the hack is really just not worth it.
And so I think in this case, you might be able to get a better emotive result from changing the prompting and being a little more methodical about it. But you also could just get the same result by going back and forth and using an iterative approach to get the exact performance that you're looking for.
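The two prompting styles compared above can be sketched side by side. This is a rough illustration under stated assumptions: the quadrant labels follow a common reading of the valence/arousal model of emotion, the numeric scale from -1 to 1 is an assumption, and the function names and prompt wording are hypothetical rather than the creator's actual prompts.

```python
def affect_label(valence: float, arousal: float) -> str:
    """Map valence/arousal scores in [-1, 1] to a rough emotional quadrant."""
    if not (-1.0 <= valence <= 1.0 and -1.0 <= arousal <= 1.0):
        raise ValueError("valence and arousal must be between -1 and 1")
    if valence >= 0:
        return "excited, elated" if arousal >= 0 else "calm, content"
    return "tense, distressed" if arousal >= 0 else "sad, drained"


def structured_prompt(line: str, valence: float, arousal: float, seconds: int = 7) -> str:
    """Structured style: state the two scores explicitly."""
    return (
        f"A {seconds}-second cinematic close-up. Valence: {valence:+.1f}. "
        f'Arousal: {arousal:+.1f}. The character says, "{line}"'
    )


def naturalistic_prompt(line: str, valence: float, arousal: float, seconds: int = 7) -> str:
    """Naturalistic style: describe the same state in plain language."""
    mood = affect_label(valence, arousal)
    return (
        f"A {seconds}-second cinematic close-up. The character looks at the camera "
        f'and says, "{line}" Their delivery is {mood}.'
    )


line = "I've been waiting for this for a long time."
print(structured_prompt(line, valence=-0.9, arousal=0.9))
print(naturalistic_prompt(line, valence=-0.9, arousal=0.9))
```

Generating both versions from the same inputs makes it easy to run the kind of back-to-back comparison shown in the episode.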
This last week, we came across a really interesting tool from a creator called Tom Likes Robots, who fixed a problem that's kind of common if you're working inside an AI video workflow. To illustrate this, let me hop inside of Premiere here. So, we have this 10-second shot that was generated inside of Seedance. Let me go ahead and play this back.
Okay, you get the idea. There's about to be gunfire and, you know, a war scene is going to unfold. Well, how do you actually continue this shot and continue the scene? Up to this point, you would have to take a still image of this character holding up the rifle and then go prompt again, and the motion continuity would just not work. But the good thing is, utilizing omni models, you can actually just bring this actual footage into your AI video tool and then prompt for the scene to continue. For example, inside of Dreamina, we can literally take that very first video clip here and drag and drop it into the reference section and say, "Show me what happens in the video. Use the last frame of the video as the start frame."
And we can define no music. Obviously, we can get into way more information with prompting here. I'm actually just kind of curious to test out the tool.
And let's go ahead and hit generate. And in return, it gives us this video clip here, which all things considered is pretty decent. Now, if we take those two video clips, put them back to back inside a video editing application, it looks like this.
Okay, so that continuity is pretty darn impressive. But you may have noticed that right at the stitch point, right between the two video clips, there's this change in exposure that makes it feel like the two video clips don't live together. So, how do we fix that? Well, up to this point, you would have to use a tool like Adobe After Effects, a crossfade, or some sort of plugin inside a tool like DaVinci Resolve to smooth everything in between.
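One way to picture the manual smoothing described above is a brightness match that fades out over the first frames of the second clip. This is a minimal sketch under simplifying assumptions: each frame is represented as a flat list of luminance values between 0.0 and 1.0, `match_exposure` and `ramp_frames` are hypothetical names, and this is not a description of how any particular stitching tool works.

```python
from statistics import fmean

Frame = list[float]  # flattened luminance values in the range 0.0-1.0


def match_exposure(clip_a: list[Frame], clip_b: list[Frame], ramp_frames: int = 12) -> list[Frame]:
    """Scale the start of clip_b so its brightness matches the end of clip_a.

    The correction is strongest on the first frame of clip_b and fades
    linearly to no correction after ramp_frames frames.
    """
    if ramp_frames < 1:
        raise ValueError("ramp_frames must be at least 1")
    target = fmean(clip_a[-1])   # average brightness of the last frame of clip A
    current = fmean(clip_b[0])   # average brightness of the first frame of clip B
    if current == 0:
        return [frame[:] for frame in clip_b]  # nothing sensible to scale
    gain = target / current
    corrected = []
    for index, frame in enumerate(clip_b):
        weight = max(0.0, 1.0 - index / ramp_frames)
        frame_gain = 1.0 + (gain - 1.0) * weight
        # Clamp so brightened pixels stay inside the valid range.
        corrected.append([min(1.0, value * frame_gain) for value in frame])
    return corrected


# Toy clips: clip B starts visibly darker than clip A ends.
clip_a = [[0.5, 0.5] for _ in range(3)]
clip_b = [[0.4, 0.4] for _ in range(4)]
fixed = match_exposure(clip_a, clip_b, ramp_frames=2)
print([round(frame[0], 3) for frame in fixed])
```

Fading the correction out, rather than applying it to the whole second clip, keeps the later frames at their original exposure while hiding the jump at the cut.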
But this brand new tool called Seed Stitch does it automatically. And there's a platform online that you can interface with to drag and drop both of your clips, and it stitches them together in a pretty realistic way. So, to stitch the video together, it's actually really easy. All you have to do is go to the Seedance 2 Stitcher website, which is kind of a prototype version of the website right now. And you can see you drag and drop video one, and then drag and drop video number two. And when you're ready, go ahead and click upload and analyze. And after a few minutes, it gives us this clip.
Very very nice. And we also have this example here. See if you can tell when the stitch actually happens.
>> Jacob.
>> We also saw new agentic workflows popping up in almost every single tool.
This last week, Runway came out with their agentic workflow, and we also saw that Dreamina is going to introduce their own AI agent directly into their platform. Which got me thinking: I have yet to come across an AI agent that is actually more productive and helpful rather than tedious in my AI workflow.
So, I have a challenge to you. The very first person in the comments of this YouTube video that can give a good example of utilizing AI agents in a way that actually saves time and money versus just doing it yourself, we will give you $100. You heard that right.
Just let us know in the comments. We'll test out the workflows and announce the winner next week. Now, speaking of events, there are all sorts of AI filmmaking events popping up all over the world. We have the AI on the Lot event happening on May 27th and 28th out here in Los Angeles. If you use the code curious at checkout, you'll get 20% off.
We're also hosting an AI filmmaking meetup in Denver on June 6th. So, if you happen to be in the area, please attend.
And then the upscale conference is happening the first week of June in San Francisco. It's a really good one and Caven who's the instructor of our advanced AI filmmaking course will be speaking at that event. And that brings us to our AI films of the week. The first project that I want to shout out is a film concept created by Marco Slavnik. And it's just a really cute 3D animation about some pigeons. There's a time machine. Basically, it's a very short film that I think is really effective. It's really funny. I think they did an amazing job. And bonus points to the pigeons that have the New York accents. Very, very good job. The next project I want to talk about is called Chapter 3. It has incredible emotional delivery. It utilizes a really good motion transfer style process.
Highly recommend checking it out. And I do want to let you know that you can always check out our gallery over at Curious Refuge for the latest and greatest AI films from around the industry. Thank you so much for watching this week's episode of AI Film News.
Again, May 20th is the deadline to apply for our first round of the DNA program.
If you are a member of Curious Refuge, I highly recommend applying over on the website. And be sure to join us for office hours every Thursday where we have live streams where we answer your questions from around the industry. And then on Tuesdays, we also bring in experts to answer questions, to break down workflows. It's just a really fun session. So, if you are interested, be sure to head over to our website to learn more. Thank you so much for watching this week's episode. We will see you next time.