
Gemini 3.5 Flash Is Getting ROASTED (And Nobody's Talking About the Best Thing They Shipped)
Added: 2026-05-23

552 views · 26 · 16:51 · BitBiasedAI · Original Release: 2026-05-22

Google’s massive price hike for marginal gains proves that benchmarks are meaningless without economic viability. True innovation lies in practical utility, not in overcharging for incremental updates.

[00:00:00]You watched the Google I/O 2026 keynote, and now you're wondering if Gemini 3.5 Flash is actually worth switching to, or if it's just another shiny announcement that falls apart the moment you use it.

[00:00:11]I spent the last few days digging through the benchmarks, the developer forums, and the early hands-on tests, and I found something surprising. The model Google is hyping hardest is the least loved by the community, and the one nobody's talking about might be the best thing they shipped all year.

[00:00:26]Welcome back to bitbiased.ai, where we do the research so you don't have to. Join our community of AI enthusiasts with our free weekly newsletter. Click the link in the description below to subscribe. You will get the key AI news, tools, and learning resources to stay ahead.

[00:00:43]In this video, I'll walk you through every major AI thing Google announced at I/O 2026: what's hyped, what works, and what to skip. We're starting with the one everyone's arguing about, Gemini 3.5 Flash, because before we get to what works, we need to talk about why the model Google put center stage is getting absolutely roasted. Context and background. If you had to summarize the entire I/O keynote in one word, it would be agents. Not chatbots, but things that take action for you: drafting emails, writing code, editing video, doing your shopping. Gemini 3.5 Flash is the engine, and Google is using it to slip AI into every corner of your day: your inbox, your docs, your search bar, your shopping cart, even your face, eventually, through new smart glasses.

[00:01:28]Last year, Gemini was a chatbot you typed into. Now it's pitched as the thing that does the work for you. Whether it actually delivers is what we're about to unpack, starting with the engine, because if the engine wobbles, the whole car wobbles with it. Gemini 3.5 Flash. On paper, Gemini 3.5 Flash looks like a slam dunk. A 1 million token context window: feed it a small novel, and it keeps the whole thing in mind. Up to 65,000 tokens of output in a single response. Full multimodal input: text, image, video, audio. Four thinking modes (minimal, low, medium, high) that let you trade speed for reasoning depth.

[00:02:06]And thought preservation that keeps the model's reasoning state across turns, so it doesn't forget what it was doing mid-workflow. The benchmarks back the hype, too. 76.2% on Terminal Bench 2.1.

[00:02:18]A GDPval Elo of 1,656.

[00:02:22]83.6% on MCP Atlas. Artificial Analysis ran independent tests and confirmed an intelligence index of 55, a 9-point jump over the previous Flash. The speed claims are wild: four times faster than comparable frontier models, 12 times faster running inside Antigravity, with independent tests clocking over 280 output tokens per second. So if you're reading the spec sheet, this should be a slam dunk. But here's where it gets interesting. The moment I started reading what actual developers were saying, the mood completely flipped.
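To put the speed claim in practical terms, here is a rough back-of-the-envelope sketch in Python. It only uses two numbers quoted in the video (up to 65,000 output tokens per response, and roughly 280 output tokens per second in independent tests). The function name is made up for illustration, and the estimate assumes a steady decode rate, ignoring time to first token and any extra tokens spent on thinking.

```python
def seconds_to_generate(output_tokens: int, tokens_per_second: float) -> float:
    """Estimate wall-clock seconds to stream a response at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Figures quoted in the video; everything else here is an assumption.
MAX_OUTPUT_TOKENS = 65_000
MEASURED_TOKENS_PER_SECOND = 280

full_response_seconds = seconds_to_generate(MAX_OUTPUT_TOKENS, MEASURED_TOKENS_PER_SECOND)
print(f"Max-length response: about {full_response_seconds:.0f} s "
      f"({full_response_seconds / 60:.1f} min)")
```

So even at the quoted rate, a maximum-length response would take a few minutes to stream, which matters if you plan to chain many long generations together.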

[00:02:53]About 60% of early community reactions are negative. Only 20% are positive. And the complaints circle back to two words: cost and quality. The cost story hits hardest. A Reddit thread that blew up broke down the math. 3.5 Flash scored slightly lower than Gemini 3.1 Pro on a head-to-head intelligence test, but costs about 5.5 times more than the previous Flash on the same benchmark suite. Roughly $1,552 versus $892 for the same workload. So if you were using older Flash because it was cheap, you're now paying premium prices for a marginal, sometimes negative, quality bump. The quality side is just as messy. One user wrote that responses feel less creative and the model loses context easily. Another developer reverted to older models for coding entirely because Flash kept producing suggestions they called useless. Someone on r/singularity called the Flash line a hallucination machine gun. Fast, sure, but firing nonsense at record speed. In fairness, not everyone's mad. An AI blogger used Flash inside Antigravity to convert a web UI into a working API endpoint and said it delivered a functional result without much fuss. A vibe coder on Reddit got it to build a payment app on the first try.
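The cost figures above are easy to sanity-check yourself. The sketch below is plain arithmetic on the two dollar amounts quoted in the video ($1,552 and $892); the function name is hypothetical. Note that those two amounts on their own work out to roughly a 1.74x difference, so the "5.5 times" multiple presumably describes a different comparison than this dollar pair (the video does not say which).

```python
def cost_ratio(new_cost: float, old_cost: float) -> float:
    """Return how many times more expensive new_cost is than old_cost."""
    if old_cost <= 0:
        raise ValueError("old_cost must be positive")
    return new_cost / old_cost


# Dollar amounts quoted in the video for "the same workload".
quoted_high = 1552.0
quoted_low = 892.0

ratio = cost_ratio(quoted_high, quoted_low)
print(f"Ratio implied by the quoted dollar figures: {ratio:.2f}x")

# What the higher bill would be if a 5.5x multiple applied to the $892 baseline.
implied_by_multiple = quoted_low * 5.5
print(f"5.5x the lower figure would be: ${implied_by_multiple:,.0f}")
```

Running the same calculation on your own monthly bill, before and after switching models, is the quickest way to see whether the price change actually affects your workload.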

[00:04:05]So, a pattern emerges fast. For well-specified, contained tasks, this model shines. For open-ended creative work or nuanced reasoning, it stumbles.

[00:04:14]Google delayed the 3.5 Pro variant for a reason. More tuning is coming. The practical takeaway: heavy users, watch your token spend before you commit.

[00:04:22]Casual users, the older tiers might serve you better right now. Not the takeaway Google wanted from this launch.

[00:04:28]Drop a comment. Team Flash, or have you already gone back to your old model?

[00:04:32]Because the next thing we're going to talk about got the opposite reception.

[00:04:35]Same keynote, completely different vibe.

[00:04:37]Gemini Omni. That announcement is Gemini Omni, and this is the one that surprised me. While 3.5 Flash is getting roasted, Omni is getting almost universal praise.

[00:04:48]Same company, same keynote, same week, completely different reception. Here's what Omni is. Google's multimodal generation and editing model. And the headline use case is video editing through conversation. You feed it footage, images, audio, whatever, and you literally chat with it to generate or edit clips.

[00:05:06]No timeline scrubbing, no keyframes, no After Effects rabbit hole. You say, "Make the mirror ripple into liquid."

[00:05:13]Or, "Swap the background to a snowy mountain." And it does it. Characters stay consistent. Physics hold up. The demo Google ran had a person walking past a sculpture, and on command, the sculpture turned into bubbles. Character identity preserved, motion preserved.

[00:05:27]That's an edit that would have taken a VFX artist a full day in a traditional pipeline. And here's the part that hit me. An AI journalist tested Omni by animating his kids' drawing into a short video and called it a resounding success, the most impressive demo of the entire I/O. The character was consistent, the scene matched the prompt, and it just worked. If you've spent any time with AI video tools, you know that is not the normal experience. The normal experience is uncanny faces, glitchy hands, and physics taking a coffee break. There's also a trust layer baked in. Every Omni output carries a SynthID watermark, so you can tell it's AI-generated. Given how loud the deepfake conversation is getting, that's a meaningful move. The catch is, it's still rolling out, with broader API access coming soon, which in press release language means anywhere from 6 weeks to never. And if you push it hard, you'll still see artifacts: weird hand movements, occasional scenes that drift off prompt. It's not magic. It's just closer to magic than what we had a year ago.

[00:06:24]Which leads us straight into the other tool that's actually working. And ironically, it runs on the same flash model everyone's complaining about.

[00:06:31]Google Antigravity 2.0. That tool is Antigravity 2.0, and it's the perfect example of how 3.5 Flash shines when you give it a focused, well-specified task.

[00:06:43]Last year's launch was Google's answer to Cursor, an agentic coding environment. Version 2.0 is a serious step up. New desktop app, native voice commands, an SDK for building custom agents. And under the hood, it's running on Gemini 3.5 Flash, which is why the coding chats feel noticeably snappier.

[00:07:01]But the real headline isn't speed, it's agent orchestration. The desktop app now lets you spawn and coordinate multiple sub-agents, chain them, schedule them, run them in parallel.

[00:07:13]Google demoed building a working operating system by splitting the work across 93 agents running simultaneously.

[00:07:18]Not a normal use case, but it's a flex that shows the architecture is real.
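To make the orchestration idea concrete, here is a minimal sketch of the general pattern (fan work out to many sub-agents, cap how many run at once, collect results in order). This is not the Antigravity SDK: it uses only Python's standard `asyncio`, and `run_sub_agent`, `orchestrate`, and `run_demo` are hypothetical stand-ins. The 93 comes from the demo described above; the concurrency cap of 8 is an arbitrary assumption.

```python
import asyncio


async def run_sub_agent(agent_id: int, task: str) -> str:
    """Hypothetical sub-agent; a real one would call a model or a tool here."""
    await asyncio.sleep(0)  # stand-in for real asynchronous work
    return f"agent-{agent_id} finished: {task}"


async def orchestrate(tasks: list[str], max_parallel: int = 8) -> list[str]:
    """Run one sub-agent per task, with at most max_parallel running at once."""
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(agent_id: int, task: str) -> str:
        async with limit:
            return await run_sub_agent(agent_id, task)

    # gather returns results in the same order the tasks were submitted
    return await asyncio.gather(
        *(guarded(i, task) for i, task in enumerate(tasks))
    )


def run_demo(n_agents: int = 93) -> list[str]:
    """Split a job into n_agents pieces and run them through the orchestrator."""
    tasks = [f"build component {i}" for i in range(n_agents)]
    return asyncio.run(orchestrate(tasks))


if __name__ == "__main__":
    results = run_demo()
    print(len(results), "sub-agents completed")
```

The semaphore is the design choice worth noticing: spawning many agents is easy, but limiting how many run simultaneously is what keeps cost and rate limits under control.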

[00:07:22]Developer reaction has been bullish, and the reasons are practical. Voice commands, saying "Create a new microservice" out loud, feel like a real glimpse of where IDEs are heading.

[00:07:32]There's a new slash command called /grill-me that aggressively interrogates your requirements before writing a single line of code. One user compared it to Claude's clarifying questions, and called it the feature they'd test most.

[00:07:44]That kind of small touch tells you the product team is actually using their own tool. It's not all rosy.

[00:07:50]The naming is confusing. Gemini CLI is being phased out for Antigravity CLI, and GitHub commenters are not happy about the migration. Pricing drew grumbles, too: a $100 per month AI Ultra plan for heavy users, which hobbyist devs find steep. Fair point, but every major lab is heading that direction.

[00:08:09]When developers actually tested it, it worked. The ChatPRD blogger had Antigravity rewrite a website admin UI into a programmatic API using Flash, and reported it delivered a functional result without much fuss. So, compared to last year's version, this is a real productivity boost, if the agent model fits your workflow. Comment below which IDE you're actually using right now.

[00:08:30]Cursor, Antigravity, Claude Code, something else? The landscape is shifting fast. And speaking of tools still settling into their role, that brings us to maybe the most ambitious announcement of the entire keynote, Gemini Spark. That's Gemini Spark, and it's the most ambitious thing they showed because it asks the most of you in return. The pitch is simple. Your personal AI agent, 24/7, running in the cloud, plugged into Gmail, Docs, Drive, and Calendar. Pichai's framing: it runs on dedicated cloud VMs, so you don't need to keep your laptop open. Spark keeps working in the background. Wired called this Google's answer to OpenAI's agent products and Anthropic's Cogram.

[00:09:10]But Google has a structural advantage nobody else has.

[00:09:13]Your data already lives in their ecosystem. So, Spark can read your inbox, scan your docs, check your calendar, and draft an email pulling real facts from conversations you've actually had. A Google exec demoed exactly that during the keynote. And it landed. You access Spark in a couple of clever ways.

[00:09:30]You can literally email it (it gets its own Gmail address), or call it up from a new Android status bar feature called Halo. The community reaction is split right down the middle. Half the AI bloggers are calling it mind-blowing.

[00:09:43]The productivity dream of an always-on helper that already knows your context.

[00:09:47]Imagine saying, "Plan next Tuesday's team meeting," and having Spark check availability, draft the invite, write the agenda, and preload the relevant docs. The other half are nervous, for good reason. Giving any AI persistent access to your entire Google account is a massive trust ask. Privacy advocates are raising flags about what gets logged, what gets shared with training, and what happens when the agent inevitably misinterprets an instruction and emails the wrong person. Wired also pointed out Spark doesn't integrate with non-Google tools yet. Chrome support is coming later this summer. Spark is in closed testing right now, with a limited release to AI Ultra subscribers next week. So, most of the reaction is speculative. Big vision, execution remains to be seen. Quick question: would you actually let an AI read all your emails? Drop your honest answer below, because that question is going to define whether Spark wins or just becomes another impressive demo nobody trusts enough to use. The Gemini app, Workspace, and Daily Brief. If you're wondering whether any of this affects the apps you already use, the answer is yes. The Gemini app got a facelift with a new interface called Neural Expressive. Vibrant animations, smoother voice chat that switches between speaking and typing without breaking flow. People like the fresh look, though it's mostly cosmetic. Same model in a prettier wrapper. What is new is Daily Brief, a personalized morning digest that pulls from your calendar and Gmail to summarize your day. Google Assistant on steroids, basically. Reaction is positive. The catch is you have to opt in and grant broad permissions, and some users are reasonably hesitant about handing over that much context.

[00:11:22]If you're already deep in the Google ecosystem, this is one of the lower friction features to try this week.

[00:11:27]Workspace is where things get bumpier.

[00:11:29]Google Pix is the new AI image generation and editing tool.

[00:11:33]Useful for mocking up visuals inside Docs and Slides.

[00:11:36]Demos looked great.

[00:11:38]User testing told a different story. One tester needed a portrait image and got back something they described as horrifying in every way. The face was just wrong. Faces are still the Achilles heel of almost every major image model in 2026.

[00:11:52]And Google Pix hasn't cracked it. Treat it as a brainstorming tool, not a finished asset tool. The other Workspace upgrade is voice in Gmail and Keep. You talk, it writes, it cites sources automatically.

[00:12:06]Accessibility win. The concern people are raising is that it'll promote shallow thinking. Fast drafts, nobody proofreads. Same critique people had about autocomplete. And it cuts both ways. One frustrating note worth flagging. A blogger went looking for the promised calendar integration in AI Studio that was demoed on stage, and it wasn't actually live yet. That gap between what was announced and what shipped is a recurring complaint across the entire launch. Keep that in mind for everything else. Universal Cart, AI eyewear, and AI Mode in Search. Beyond the headliners, Google packed a lightning round of smaller announcements into the keynote. Universal Cart is the agentic shopping play.

[00:12:42]Browse products across the web, add them to one universal Google Cart, get AI alerts on price drops, better deals at other retailers, one-click checkout.

[00:12:51]Cool in theory.

[00:12:53]The skepticism is that Google's ad business benefits from keeping you in its ecosystem. So, best deal recommendations might not always be the best deal for you. Worth watching, but verify before you trust. Then there's intelligent eyewear, new Android XR glasses that do live translation, navigation overlays, and AR without needing your phone tethered. Ships this fall. Forum reaction was muted, and the reason is interesting. Wearable AI vision is cool, but the question every commenter asked was the same. Will the glasses look normal enough to wear in public? Style matters more than specs here, and Google has historically not nailed the consumer hardware aesthetic question. And AI Mode in Search: a year after launch, over 100 million Americans have tried it. The update adds generative UI inside the search box. AI-generated infographics, summarized articles with charts, embedded videos answering your question directly. Tech fans love it. Journalists are less thrilled. Google's AI Overviews sitting above news links are going to gut publisher traffic. That fight is just getting started. What the experts are saying. The bigger conversation across the AI community is sharper than the marketing. On the bullish side, one AI influencer summed it up in three words: Google is back. Demis Hassabis tweeted that the demos showed mind-blowing potential for developer productivity. Multiple analysts called the benchmark scores on Flash insane for a Flash model, meaning the lightweight tier is outperforming what last year's flagship could do. On the cautious side, Sam Altman weighed in with a sharp observation. Google's bet is on agents over chatbots. And his take was, we'll see how much users care.

[00:14:25]Translation, he doesn't think Google has read the user demand correctly. Replies pushed back, but it's a fair question.

[00:14:31]ML researchers on X flagged a more technical concern. Flash's pricing blurs the line between the Flash and Pro tiers.

[00:14:38]One thread put it plainly: Flash is becoming more expensive and absorbing Pro territory. That's a confusion tax Google will have to address. And the most upvoted comment on r/MachineLearning was probably the most useful sentence in the entire conversation. Don't overhype this. Wait for more data. That captures where the community has landed.

[00:14:55]Cautiously optimistic. Impressed by the tech, especially video and agents, but carrying a real show-me-don't-tell-me energy. Final takes and what to do next.

[00:15:06]Pulling it all together, the picture is pretty clear once you stand back from it. Omni is the win, Flash is the controversy, Spark is the wild card, and everything else (Workspace, Universal Cart, the glasses, AI Mode in Search) is rolling out at varying speeds with varying degrees of polish. Some of it works today, some of it's still vaporware in a keynote slide. So, here's the actual game plan. If you're a developer, this is the week to test 3.5 Flash inside Antigravity on a real, focused task. Compare it to whatever you're using, watch your token spend, and decide based on your own workflow, not the keynote hype. If you're a content creator, drop everything and play with Gemini Omni. It's the most impressive video tool Google has shipped, and you can use it right now in the Gemini app and Google Flow if you're a paid subscriber. Rare case where the demo and the reality are close enough to recommend without a million caveats. If you're a regular user, turn on Daily Brief if you live in Gmail, and keep an eye on AI Mode in Search. It'll change how you find information whether you opt in or not. The honest framing: don't believe the hype, but don't believe the doubts, either. Focus on the tools that work now. Antigravity for focused dev work, Omni for video.

[00:16:17]Everything else, give it a few weeks and let other people find the bugs first.

[00:16:21]Stay tuned for 3.5 Pro and Spark's full launch. That's when the next chapter starts. I want to hear from you. What are you most excited about? Omni, Spark, or something else I covered?

[00:16:31]If you've already tried 3.5 Flash, drop your honest take below. Did it work, or are you reverting to an older model? If this breakdown saved you a few hours of digging, hit subscribe. I'm doing hands-on tests of each of these tools in the next videos. Real workflows, real bugs, real verdicts. See you in the next one.

#google io 2026 #google io 2026 recap #google io 2026 highlights #gemini 3.5 flash #gemini 3.5 flash review
