The video provides a solid overview of the narrowing gap between open-source and proprietary models, though it leans heavily on speculative versioning to fuel AGI hype. It captures the industry's momentum well but mistakes rapid incremental progress for a fundamental breakthrough in general intelligence.
Deep Dive
Kimi K2.6, GPT 5.5, Deepseek V4, Codex Superapp, Gemini 3.5, Grok 5 = AGI, & More! Huge AI NEWS!
We've just come off one of the most intense stretches in AI news, and honestly, this next week looks even crazier, because today alone brought some massive new developments. Moonshot AI dropped Kimi K2.6, an advanced open-source coding model that is already being compared to Opus 4.5 and Opus 4.6, which is kind of wild considering it's open source. At the same time, GPT 5.5, aka Spud, is reportedly right around the corner, possibly dropping today or Thursday. The leaked generations people have been sharing look quite impressive, outperforming models like Opus 4.7 and Gemini 3.1 Pro in a lot of cases. On the Google side, we are starting to see new Gemini checkpoints surface, clearly gearing up for announcements at the Google I/O conference. Meanwhile, Qwen 3.6 Max has officially been released, Codex is evolving into more of a super app, and that's barely scratching the surface of what's happening right now. So, let's dive quickly into all these news topics.
There's a lot of stuff in the AI space that I don't put on the YouTube channel, and you can access it through my free newsletter with the link in the description below. To start off, let's take a look at Kimi K2.6, a new open-source coding model that is delivering state-of-the-art results across benchmarks like SWE-bench and BrowseComp, as well as advanced math and vision tasks. You can see that it competes with proprietary models while still being an open-source model.
In some cases the comparison is against Opus 4.6, which is nice to see. Still, the fact that this model is on par with, or only slightly behind, these proprietary models is insane. It introduces major upgrades like long-running coding sessions of 12-plus hours with 4,000-plus tool calls, plus massive agent swarms with 300 parallel agents.
It also has strong multi-language, multi-file development from a single prompt. On top of that, it is extremely efficient: about 94% cheaper on input and 95% cheaper on output compared to Opus 4.6, while still outperforming it on SWE-bench Pro.
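To make the pricing claim above concrete, here is a rough Python sketch of what a 94% input discount and a 95% output discount would mean for a single job. The baseline prices and token counts are hypothetical placeholders, not the actual rates of Opus 4.6 or Kimi K2.6; only the two percentages come from the video.

```python
def discounted_price(base_price: float, percent_cheaper: float) -> float:
    """Price after taking 'percent_cheaper' percent off base_price."""
    return base_price * (1 - percent_cheaper / 100)


def job_cost(input_tokens: int, output_tokens: int,
             price_in_per_m: float, price_out_per_m: float) -> float:
    """Total cost of one job, with prices quoted per million tokens."""
    return (input_tokens * price_in_per_m
            + output_tokens * price_out_per_m) / 1_000_000


# Hypothetical baseline prices (USD per million tokens), for illustration only.
BASE_IN, BASE_OUT = 10.0, 50.0

# The only figures taken from the video: 94% cheaper input, 95% cheaper output.
cheap_in = discounted_price(BASE_IN, 94)
cheap_out = discounted_price(BASE_OUT, 95)

# Hypothetical long agentic session: 2M input tokens, 0.5M output tokens.
baseline = job_cost(2_000_000, 500_000, BASE_IN, BASE_OUT)
cheaper = job_cost(2_000_000, 500_000, cheap_in, cheap_out)
savings = 1 - cheaper / baseline

print(f"baseline ${baseline:.2f}, cheaper ${cheaper:.2f}, savings {savings:.1%}")
```

Whatever the mix of input and output tokens, the blended savings under these assumptions always land between 94% and 95%, since the total is a weighted average of the two discounts.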
For example, Kimi K2.6 can design and execute complex multi-step workflows end to end, like building full quantitative strategies across hundreds of assets. When it comes to front end, this model is exceptional because it's able to use tools. You can see that it's able to develop beautiful landing pages with different attributes, dynamic movements, and various sorts of typography. This is the kind of output you normally get from proprietary models, not something you would expect from an open-source model. Now, I know the speed may be atrocious, but you can run Kimi K2.6, a full one-trillion-parameter VLM, locally using MLX on dual M3 Ultras, and you can see that it is able to process generations on this setup.
Next, just yesterday I made a video on GPT 5.5, aka Spud, which is rumored to be OpenAI's next major model upgrade. Currently, it is being A/B tested inside ChatGPT. We took a look at early demos and tested it out, and we noticed that it is incredible in terms of speed, token efficiency, and reasoning, with faster outputs and stronger performance on complex tasks. It's especially impressive in areas like coding, SVG generation, game creation, and even 3D workflows using tools like Three.js. The model also stands out in terms of its creativity: it goes beyond the prompt to add structure, detail, and better design direction on its own.
Overall, it feels like a big step forward: more complete, more agent-like, and not just generating code but building full experiences. The best way to think about it is as a halfway point to GPT-6, combining better reasoning, faster performance, and lower cost in one model.
Today I actually created an Excel clone using GPT 5.5. It isn't just cloned visually; it actually feels real, because you have grid behaviors and formatting interactions, and it is scarily close to the real thing. I didn't really expect this from GPT 5.5. The front-end development from this model is definitely something to look out for, because if it is token efficient and more readily accessible than Opus 4.7, then this may be the option you would want to use for coding-related tasks over Opus.
Not to get your hopes up, but this is something I personally saw on Polymarket: a lot of people suspect that GPT 5.5 could be released today or, at the very latest, Thursday, because these are the two days OpenAI usually tends to drop its models, and there are a lot of signs pointing to OpenAI releasing a model this week. Take this with a grain of salt.
Next up is a claim coming from Zank, a Princeton PhD researcher and Princeton AI Lab fellow who works on large language model reasoning. According to these reports, DeepSeek V4 could be releasing this week, and it features a massive architecture design that includes sparse MQA, fused kernels, and hyperconnections. It is rumored to be a 1.6 trillion parameter model positioned to compete directly with models like Opus 4.7 as well as the new GPT 5.5. Early leaks suggest extreme performance levels, including MMLU around 99.4% and SWE-bench at 83.7%, though these numbers are still unverified. Because of the model's scale, only heavily quantized versions would realistically run locally, potentially requiring something like a 512 GB class machine as well as extensive hardware to actually run this model.
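The hardware claim above can be sanity-checked with back-of-the-envelope arithmetic. The Python sketch below estimates weight storage only for the rumored 1.6 trillion parameter count at several bit widths. It assumes every weight is stored at the same precision, counts 1 GB as 10^9 bytes, and ignores KV cache, activations, and runtime overhead, so real requirements would be higher. The function names are illustrative, and the parameter count itself is an unverified rumor.

```python
def weight_memory_gb(num_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


def fits_in_memory(num_params: int, bits_per_weight: float, ram_gb: float) -> bool:
    """True if the weights alone fit in ram_gb; ignores all runtime overhead."""
    return weight_memory_gb(num_params, bits_per_weight) <= ram_gb


RUMORED_PARAMS = 1_600_000_000_000  # rumored, unverified parameter count
RAM_GB = 512                        # the "512 GB class machine" from the video

for bits in (16, 8, 4, 2):
    size = weight_memory_gb(RUMORED_PARAMS, bits)
    verdict = "fits" if fits_in_memory(RUMORED_PARAMS, bits, RAM_GB) else "does not fit"
    print(f"{bits:>2}-bit: {size:,.0f} GB of weights, {verdict} in {RAM_GB} GB")
```

Under these assumptions even 4-bit weights come to about 800 GB, so only something close to a 2-bit quantization (about 400 GB) would fit in 512 GB, which is consistent with the "heavily quantized" caveat.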
And while all of this is still in the rumor stage, multiple leaks across forums and research circles suggest that we could actually be seeing DeepSeek V4 this week.
Next is the Codex app, which OpenAI recently introduced. It's no longer just a coding tool, because it can now interact with apps on your Mac, connect to external tools, learn from your previous actions, remember how you like to work, and handle ongoing repeatable tasks. One of the biggest additions is computer use on macOS, where Codex can see, click, and type using its own cursor. It runs in the background without taking over your system, which means it can handle things like front-end iteration, app testing, and workflows that don't even have APIs. On top of that, it now supports image generation with GPT Image 1.5. By the way, we might even see GPT Image 2 soon directly inside Codex.
This lets you create front-end mockups, game assets, and UI designs without leaving your workflow. Automations have been upgraded too: tasks can now continue across sessions in the same thread. Codex can schedule work, pause, and then wake back up later with full context, whether it's finishing PRs or tracking ongoing projects. At this point, it is basically turning into a full super app for development and automation.
On top of that, they've also just released a research preview of something called Chronicle inside Codex. This allows Codex to build memories from your day-to-day work on your computer, and it then uses those memories to become significantly more helpful and context-aware over time. It's currently available for Pro users on Mac only. While it's still early and fairly token-heavy, people inside OpenAI are already saying that it has noticeably changed how they use Codex in daily workflows.
Next is an xAI model that I find truly underrated: the Grok 4.3 beta, which is exceptional at front-end as well as coding-related tasks.
Here I asked it to create a beautiful CS:GO clone, and you can see that it actually created a full-on bazooka, which is incredible. This is xAI's latest test model, at approximately 0.5 trillion parameters. It's a newer checkpoint with an improved architecture, and its knowledge cutoff is December 2025. You can access it if you are on SuperGrok Heavy, and it should be available to most users soon. The key upgrades are native multimodality with better visual understanding, agentic tool use, and coding.
It can even generate full docs, slides, PDFs, and spreadsheets, and it also reasons better with fewer hallucinations.
Now, I know that model is great and all, but what's even wilder is what Elon recently stated about the Grok roadmap. According to his comments, Grok 4.4 is expected to be a one trillion parameter model releasing in early May, followed quickly by Grok 4.5 at around 1.5 trillion parameters in late May. And then comes the biggest claim of all: Grok 5 being positioned as AGI.
Now, if that is even partially accurate, it would mean that only two major model releases separate us from what he's calling AGI, based on his own timeline. Something to note is that we don't really know what Elon's definition of AGI is, so just keep that in mind.
Next up, Alibaba has quietly dropped a preview of its flagship model, Qwen 3.6 Max Preview. It's an early look at their next-generation system, and the focus is clearly on stronger agentic coding capabilities compared to Qwen 3.6 Plus, along with better instruction following and improved real-world reasoning and knowledge reliability. In simple terms, it's designed to be smarter, more consistent in long-horizon tasks, and more capable as an autonomous agent in practical workflows. Unfortunately, this isn't open source, but hopefully we can get an open-source preview of this in the future.
With the Google I/O conference roughly 28 days away, the hype is definitely starting to build up again, and Google has begun testing newer Gemini checkpoints inside AI Studio. Early signals suggest we could be looking at something like Gemini 3.2 Pro or even Gemini 3.5 Pro, and that it is significantly upgraded. There are also hints that what's being tested might actually be a lighter Flash-type variant of the Gemini 3.1 series rather than a full flagship jump, but this is all speculation from what we're seeing. The quality doesn't look like a major leap yet, but it's still very early in the A/B testing, so things could change quickly, like we previously saw with the initial Gemini 3 release.
Speaking of the conference, leaks are starting to surface, and Google is reportedly testing a new Cowork-style competitor inside Gemini and Gemini Enterprise, currently referred to as Agent.
It is going to function exactly like Cowork, letting you automate agentic tasks: you can delegate goals, assign agents, and connect applications. Where I really think this would be useful is working across all the Google Workspace connections, which would let you automate Gmail, your spreadsheets, and anything that is cloud-based and Workspace-related.
Also, Google has now expanded what your Google AI subscription covers so that it works inside Google AI Studio. This means increased vibe coding limits and direct access to Pro models without needing to link an API key. So, in practice, if you have the Pro or Ultra plan, you get extended usage within Google AI Studio.
And to wrap it all up, we've hit a point where AI-powered robotics is starting to feel straight out of a sci-fi simulation. There's now a full-fledged robot competing in a marathon, even outperforming humans in certain segments, which already feels a bit dystopian on its own. What's even more surreal is how it's engineered: the whole system, the movements, everything. It even works like an F1 pit stop, where humans step in quickly to service the robot and cool it down between runs, which is kind of funny.
In some cases, they are even using dry ice for cooling, which essentially turns it into the F1 of robots. This was fun, and I wanted to include it in today's video.
If you like this video and would love to support the channel, you can consider donating through the Super Thanks option below. Or you can consider joining our private Discord, where you can access multiple subscriptions to different AI tools for free on a monthly basis, plus daily AI news, exclusive content, and a lot more. But that's about it, guys, for today's video. I hope you found this video informative and that you were able to get some value out of it. Make sure you take a look at the second channel, where we're also posting AI content on a daily basis, as well as our newsletter and the Discord, and follow me on Twitter. Lastly, make sure you subscribe, turn on the notification bell, like this video, and take a look at our previous videos so that you can stay up to date with the latest AI news. With that thought, guys, thank you so much for watching. Have an amazing day.
Spread positivity, and I'll see you guys really shortly.