Large language models like Nemotron 3 ULTRA (550 billion parameters) utilize hybrid architectures combining transformer and Mamba components to efficiently handle long contexts (1 million tokens) while maintaining performance. These models excel at agentic workflows, creative roleplay, and general knowledge tasks but may struggle with complex coding tasks from scratch. The model demonstrates strong capabilities in creating detailed scenes, implementing UI elements, and performing niche technical queries, though it requires iterative debugging for complex applications. Open-source models with accessible training recipes enable researchers to understand and improve upon model architectures, making them valuable for enterprise applications requiring US-based AI solutions.
Deep Dive
Nemotron 3 ULTRA First Look & Test – NVIDIA’s LARGEST Model Yet!
...actually looks like a walking corn dog, a humanoid/zombioid/corn-doggoid-looking model. Today we're going to be testing Nvidia's Nemotron 3 Ultra, which is a new 550 billion parameter open-source model from Nvidia. Now, this is really exciting because not only is this the biggest Nemotron, but this is just a very large, performant open-source model, and it's always nice to have more of those. So, I'm really, really excited to test this model. And before we get into it, a couple of things. One, please do feel free to subscribe, as I do want that 100K plaque. And two, I have early access to this model. This video is not sponsored, but I do have access to it, and they're serving it from their services. So, this is obviously going to be released open source publicly. But because of that, I don't necessarily have a Hugging Face model card or an official release post to go off for the introduction of this video. So the introduction may be a bit shorter than normal, but I think it's probably better because I'm very excited to test this model. I've genuinely done like one thing with this, and it was not at all even a test that we do on this channel. So, I don't really know what this model is going to be like, and I'm excited to test it. As we can see here, the first thing I did was just ask it "model slug, bro," which is where it would just give us the model slug, which usually refers to the specific identifier string for a model. And it did respond that it is Nemotron 3 Ultra.
So, we know that is the model we're speaking with right here. This is just a vibe-coded web chat interface that Opus made me. So, with that, let's take a quick peek at some of the pertinent things about this model, and then we'll get into some fun and light testing. So, I've put together this little Excalidraw presentation in lieu of having a Hugging Face model card or something, just with some things I've remembered about this model as well as a couple of photos pertaining to its benchmark JPEGs and things of the sort. So for technical specs, this is a 550 billion parameter model. It is a mixture-of-experts model with 55 billion active. It also has the hybrid Mamba-transformer architecture. So hypothetically, well, not hypothetically: over a longer context, its VRAM requirement would not balloon massively, as opposed to a model that did not have that hybrid Mamba-transformer architecture.
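To make the long-context point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a standard key/value cache for the attention layers and treats Mamba layers as adding no per-token cache, since their state has a fixed size regardless of context length. The layer counts, head count, and head size are hypothetical placeholders, not Nemotron 3 Ultra's actual configuration, which the video does not give. Only the 550 billion total and 55 billion active parameter figures come from the video.

```python
def kv_cache_gib(context_tokens, attention_layers, kv_heads=8,
                 head_dim=128, bytes_per_value=2):
    """Approximate key/value cache size in GiB for a single sequence."""
    # Each attention layer stores one key and one value vector per token.
    bytes_per_token = 2 * attention_layers * kv_heads * head_dim * bytes_per_value
    return context_tokens * bytes_per_token / 2**30


# Hypothetical layer counts, chosen only for illustration.
TOTAL_LAYERS = 96             # every layer is attention in the "pure transformer" case
HYBRID_ATTENTION_LAYERS = 12  # only these layers keep a KV cache in the hybrid case

# Figures quoted in the video: 550B total parameters, 55B active per token.
active_fraction = 55e9 / 550e9
print(f"active parameter fraction: {active_fraction:.0%}")

for tokens in (8_000, 128_000, 1_000_000):
    full = kv_cache_gib(tokens, TOTAL_LAYERS)
    hybrid = kv_cache_gib(tokens, HYBRID_ATTENTION_LAYERS)
    print(f"{tokens:>9,} tokens: full attention {full:8.1f} GiB, hybrid {hybrid:8.1f} GiB")
```

Under these assumptions the cache still grows linearly with context in both cases, but the hybrid needs far fewer gigabytes per token because fewer layers contribute to it. The real savings depend on the actual layer mix and cache precision.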
In conjunction with that... did I spell "trans..."? It does have a 1 million token context length, which is pretty cool, and it is open weights and open source. So the Nemotron family of models always has the training recipe along with the model. Essentially, they open source more than just the weights. They also open source the things that led up to the existence of said model. So we see some benchmark JPEGs right here. And I do like to see that this is being compared to other open-weight models, which are, at least in the case of GLM-5.1 and Kimi K2.6, larger than this model. However, included here as well is Qwen 3.5 397B.
And something we're going to notice with this model specifically, and I think the Nemotron models in general, at least from the 3 family, is that they're not necessarily designed to be coding beasts. So, we can see on the coding score that we get right here, just in the one that is included, it lags behind GLM-5.1 and Kimi K2.6, which does make sense in some regard because those are larger, but it's also basically on par with Qwen 3.5 397B. So, we likely won't be getting insanely cool Three.js games out of this. We'll probably be able to get something, though, which is cool. But something these models are always more poised towards is basically enterprise-safe use. And what is reflected by that is just looking at this. So, how productive is the agent?
And then beyond that, the instruction following. So it's designed more for workflow things and maybe not necessarily the types of tests that I like to predominantly run on this channel, but we'll work around that. We also have some other things. This does have MTP (multi-token prediction), so it can generate tokens quickly. It is trained with that natively, I believe. Then also an interesting chart that we see right here is cost to task completion versus number or percent of tasks complete, which is actually kind of cool. It just shows that it's cheaper to run than these other comparable models while performing in the same sphere as they do for tasks completed. And this again ties back into its capability as a good agent model.

The next thing to do is hop into this vibe-coded web interface and begin with our tried-and-true browser OS test. So this is the browser OS test v2.5, where it needs to include two 3D applications, one of which is a GTA clone, and then some other browser things, including a special feature. Now again, this model is not specifically designed for coding, but because it does seem to be relatively on par with Qwen 3.5 397B in some coding areas, these are all things that I would be comfortable running with that specific model. So therefore, I am comfortable running them with this model as well. All right, so that was 2,33 lines of code. And this model was going... What the heck? Frames per second in the browser OS? How do you knock something like this out, which is sick, and then mess up the icons for the apps?
I have to say, I'm already impressed.
Does it have a right click?
Yeah, it does have a right click. And it has a darn good one, too. Okay. WebOS 3.0 neon kernel. All right. That is the correct time. Workspace frames per second.
Let's check our start menu. Yep. And the application icons not being present are preventing this from perhaps being one of the all-time greats. So, all right, we'll work around that. Let's just run through these apps one by one. No, I'm going to start with the GTA game.
Okay, we'll try terminal. Okay, so this does work. That was my fear is like there's some like funkiness here.
Help. Okay, we have quite a few. Can we launch an app from there? Okay, launch GTA.
Let's do neofetch. Oh, wow.
I think that's a face theme. Cyberpunk digital rain. Okay, let's try that.
Matrix.
Close window to stop injecting Digital Rain.
All right, I don't see any right now, but it does. Oh, how do we close this?
Uh, that. Okay, so that's full screen.
That's closed. And it does go to the navbar. Uh, that was minimized, I meant to say. All right, let's check our file explorer. Okay, I really like this theme personally. Open workspace. Oh my, what the heck? This model seems to have some... Oh, I just want to... This whole video is going to be testing Three.js games. All right. I don't know how to... This is so frustrating because it's good, but also some of the stuff doesn't work, which makes me quite enraged. We're going to have OpenCode crack away at fixing this browser OS.
All right, so I've given it the issues we're having, and I've swapped it to plan mode. There, it's doing a good job of properly showcasing everything here.
It did a very, very big to-do list. It's asking us questions. Uh, I hate base64.
You know what? I think this model can actually handle that. So, we'll do that.
Button icons minimal is fine. Confirm.
Okay. Sick. Good. It did do it. So, it initially had encountered a problem where the fixes it tried to make in OpenCode just were not working. I pasted back the issues that were here in the developer tools, and it fixed them very, very quickly. So, now we do have icons that are properly associated with their specific app. Let's now check our start menu. Very good. Okay. So, we did look at terminal and files and Notepad as well. Everything does look a bit better now, which I do like to see. New folder.
Good. And it does add it. Documents. We should still be able to make a new... Good. Good. It's got a functional file system. Oh, okay. So, these are kind of hidden, like the close and things. Let's just check the GTA game. That's so frustrating. It doesn't open. The games aren't seeming to work now. Because this is just the first result, I think I'm probably going to move on to the next test. Oh, we can move these. Oh, what the... I don't know that I've ever seen that. Next up, we're going to try the beautiful subway scene test, which just needs to make a beautiful static subway scene that we can move around in. Assuming this does get completed satisfactorily, we will then turn this into an FPS game just using the map that it generates. If there are issues here, I may just go back into OpenCode and have it both fix the issues and then generate the game in one shot. So, here is our beautiful subway scene result. I feel like this should have appeared by now.
So, I'm going to check the developer console. I'm going to try to fix this the old-fashioned way, where I just send it back to it in the chat interface here. So, the behavior here was kind of funny: it made changes which did somewhat work, because that specific error is gone. But if I tried opening it the old way, it had actually hard-coded in a specific error message, or instructions, saying you're using this wrong, you need to do it this way. So, I did it that way and the error message goes away, which is interesting to see.
But unfortunately, the problem is nothing is actually appearing in terms of the assets and I don't believe there are any specific issues right now. Oh, wow. Okay. Well, I was 100% wrong there.
So, unfortunately, the subway result just never quite got there. So, I've put the instructions for this in a specific directory, and I'm going to say "build this." It will understand to search for the instruction text file and then plan out an implementation for it.
Unfortunately, we got closer but are still just not 100% there. So, I really do think this model, with some additional training or coding-focused tuning, would be a monster with some of the coding capabilities. All right. Three.js. Yes. Platform: browser. Yes.
Scope full detail. Yep. Confirm. All right. Oh, wow. Okay. So, it's going to really go all out with this.
I did update the context length in OpenCode to properly reflect what this model is capable of. So, look at the large, large, large file directory this is planning to do. This might take a while. Okay, so after far, far too long, it is still unfortunately failing to get the subway test working. I'm basically telling it now: every time there's a new error, I'm sick of copy-pasting it. So you have two options.
Either find a way to read the Chrome console logs and then keep fixing them, or just say "I can't do this" and bail out gracefully. So we'll see what it does. So it did properly put together a pipeline so that it can test this script autonomously, find the errors from the console, and then try to fix them. With that said, I'm not feeling very hopeful that this is actually going to get to a point where we get a functional, playable result. But I'm going to allow it to run this for a while because it's interesting to see, and it also is somewhat of a longer-context test across complicated code with a lot of different scripts.
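The "keep fixing or bail out gracefully" instruction can be sketched as a small loop. This is only an illustration of the idea, not the pipeline the model actually built, which the video does not show. The names `fix_until_clean`, `collect_errors`, and `attempt_fix` are hypothetical, and the error strings are made up for the demo. In a real setup, `collect_errors` would load the page in a browser and read its console output.

```python
from typing import Callable, List


def fix_until_clean(collect_errors: Callable[[], List[str]],
                    attempt_fix: Callable[[List[str]], None],
                    max_rounds: int = 5) -> bool:
    """Return True once no errors remain, False if the round budget runs out."""
    for round_number in range(1, max_rounds + 1):
        errors = collect_errors()
        if not errors:
            return True
        print(f"round {round_number}: {len(errors)} error(s) to fix")
        attempt_fix(errors)
    # One last check after the final fix attempt; otherwise give up gracefully.
    return not collect_errors()


# Demo with stand-ins: each "fix" resolves one of the pending (made-up) errors.
pending = [
    "ReferenceError: loader is not defined",
    "TypeError: scene.add is not a function",
]


def fake_collect() -> List[str]:
    return list(pending)


def fake_fix(errors: List[str]) -> None:
    pending.pop(0)


if fix_until_clean(fake_collect, fake_fix):
    print("console is clean")
else:
    print("giving up: errors remain after the round budget")
```

The round budget is what makes the bail-out graceful: without it, an agent that cannot fix an error would keep retrying indefinitely.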
It is now a couple days later and the model has been publicly released.
However, I have not touched this at all since just telling it to fix the subway station result, or to create it, from within OpenCode. Now, I am happy to report that after quite a bit of time, it did produce something that had one simple syntax error right here on line 163 in the main.js file. It basically had just misnamed this specific loading piece right here. So, fixing that allowed it to actually open. I verified that it opened, looked at it for like 0.5 of a second, and was like, "Oh, cool."
So, that's exactly where we're going to pick back up is where we left off. So, here is our station. And it does actually work now. I haven't looked at it at all. Okay. Oh, wow. Um, so this is going to This is Oh, it turned it into an I wonder if we have any sound. So, the speakers here, obviously, there's some fairly blatant issues with the camera.
Okay, good. There are sounds. Um, let's just not touch anything so we can look at the scene. I'm going to say something I noticed is the ceiling is actually pretty well done model-wise.
There are floating particles and the pistol does have muzzle flash and things like that. So, I want to just try to Oh, okay. So, it's a bit inverted. I do see enemies and things. Look at them. Oh, they look like little androids. Like the Android operating system. Um, actually looks like a walking corn dog.
I want to get down there because I see a bunch of like Okay. Yep. So, we need to just not move the mouse at all.
Those are footsteps you hear perhaps at a rhythmic pace that is questionable, but and then like look, it put in gates and lights and things like this. This again. Oh, okay. Now we're in outer space.
I'll try to get to one of the enemies, but... Okay. Well, more or less functional, and very weird enemies. I've not seen that before, a humanoid/zombioid/corn-doggoid-looking model. Next up, I just want to try the flight combat simulator game. I'm going to be testing this specific one just using this through OpenRouter. Now, to ensure that we have everything set properly: this is currently offered for free, and Nvidia is the one serving the free offering. There is also a paid one, but it's being served by a third-party provider. So, aside from not having to pay for it, I also just want to stick with this one because it's being served by Nvidia themselves as opposed to a third-party provider. So, here is our flight combat simulator result. Okay, I noticed right off the bat this has kind of a similar aesthetic to our WebOS 3.0 neon kernel.
Now, it's frustrating because it doesn't 100% work. We can see elements of what would have been contained within here. But this is where we start to see one of the limitations of the Nemotron model in terms of its coding prowess with a niche, specific task like a Three.js web game. So the UI and everything like that looks competently done. The tactical radar we have up there is cool. Sadly, none of the actual models are there. And because this was run just through OpenRouter, it's not going to be as easy to have it fixed as it would be with OpenCode. However, I figure it's worth a shot if there are any fairly simple errors. I will give it the one main error that was being repeated there, and we'll see if we get something updated to be fixed. So, here's our hypothetically fixed plane game. Now, keep in mind, I only gave it the one error we saw. Very good. It did fix it. Now, yes, it still looks pretty weird, but I do actually see some ammunition effects. The mini map is working, showing the location of the enemies. Now, it seems we have, um, lost some form of... I don't quite know, but this does show some promise because it did actually fix the issues we were having. And here's our propeller plane.
They're very dark and hard to see, but it does have more or less a plane shape to it. Hey, so we lost there. And the logic actually worked where if you lose, you go to the home screen after a two-second black screen. Okay. And then this is the interceptor. Whoa. Was that like a... Okay. I don't know what just happened, but... And we keep losing immediately. But the important thing is it did actually fix this and give us something that's at least somewhat playable. Simultaneously, I went to the Nvidia website here, where they do serve the model, and I ran a simple 3D printer simulation test, because it's always fun to take a look at what these end up giving us. So, let's take a peek. Okay, this UI is really good, because look, I don't think I've ever seen a filament selection feature with the 3D printing sim. Now, unfortunately, the main issue we're noticing is it just doesn't work on the first try. This could be a very good candidate for OpenCode to fix the issues with, because this interests me. I do like the side panel UI of this.
Supposedly, it fixed the 3D printer sim result. Now, I don't believe it did just because I didn't give it the specific error that was causing us problems. So, it looked at this just by basically combing through the code and trying to find any more obvious syntax issues.
Because of that, I will just copy-paste this one specific error and give that to it. That was very, very quick. So, let's see. Awesome. Okay, we're getting some warnings, but I'm okay with that because we do actually have some improvement here. Unfortunately, we can't move the scene around to see, but I noticed some Z-screws and actually shaft collars, or whatever you'd want to call that, which is something true to life that 3D printers have on the Z motor with the Z-screw. Let's just try Start Print and see if we get any movement that's discernible.
Okay, unfortunately not. There's not a lot of functionality here, but it did fix the issue that we gave it very quickly, I might add. So, that's good at least. So, this is the second time I've tried running this skate game, and it just keeps getting tool call errors.
This is the third time this has happened, so I'm just going to unfortunately pass on the C++ skate test. This next one is something I've not ever done before, and I don't know how difficult it is. I don't think it's hyper difficult. I'm telling it to make me a 3D chess game in Python, single player versus AI, because I want to just try some Python with it as well. I have just initiated this from within build mode through OpenCode, because after the skate game I just want to get more results to check. Okay.
What sort of chess variant would you like?
I'm just going to go with all of the standard things. Pygame and OpenGL is fine. No. Um yeah, that's fine. Good. So it should build this for us. All right.
Well, this is probably where we got to in terms of this. It's not really showcasing the errors that I'm seeing here. Seems to run, but there's no visible window. All right. Yeah, I'm going to pause this for now. So, it did make a Python script that was at least able to be opened, and it was able to autonomously open it. Oddly, some of these specific errors weren't showing up here in OpenCode. I want to just ask it a general knowledge query right here.
Now, this does not have web search access, so it's either going to answer this if it knows it intrinsically or it's not going to know the answer. Good.
This is the first, I believe, open-source model that I've seen actually correctly answer this, where that is the specific big weakness of this vehicle, which is that the camshaft sprocket separates. Um, you weld the camshaft sprocket to the cam and then it stops that from happening. So, that's very well-known. Everything else here I'm not really interested in, because the one thing that I was specifically looking for is right here. And this did accurately have this. All right, I've now turned reasoning off and I'm just back in the vibe-coded web chat interface. Let's just do a roleplay test. Surprisingly, these models have actually historically been quite all right at roleplay. Bell above the door chimes. Worn brass, sticky hinge.
Fluorescent lights hum overhead, bathing the shop in that particular shade of clinical beige. Somewhere in the back, a fan whirs. Dust motes drift through the shafts of afternoon light cutting through the blinds. Steve's PC Repair and Upgrades. Established 2003. "If it plugs in, I can fix it. If I can't, I know a guy." A handwritten sign taped to the counter reads, "No, I can't make your 2005 laptop run Cyberpunk. Stop asking." Behind the counter sits a man in his late 40s. Gray streaking through dark hair at the temples. Reading glasses pushed up his forehead, forgotten. A static wrist strap dangles loose from his left wrist. He's hunched over an open tower case, flashlight clenched between his teeth, probing something on a motherboard with a multimeter probe. He pulls the flashlight free, squints at the reading, mutters something about capacitors and planned obsolescence, glances up. The pen tucked behind his ear leaves a blue smudge on his temple when he pushes his glasses back down. "Yeah, what's the damage?" This is a lot of descriptive scene.
Less dialogue, more description of the scene. Voice is gravel and coffee. He sets the multimeter down on a stack of anti-static bags, wipes his hands on a shop rag that's seen better decades, leaving them somehow cleaner than before. "Laptop, desktop, phone, printer?"
A dry chuckle.
"Printer. God help us both if it's a printer. Talk to me. What'd you do to it?" Okay, "Cut the act, Steve." I'm going to just wait. I won't read through the whole thing. I mean, with the way this thing's been performing in coding, I might have... A hard drive clicks somewhere in the back. Dying breath of a Seagate Barracuda. That's funny. Single word, flat. He leans back in the creek.
Everything in green here is just description of the scene. There's actually very little dialogue here. "You want the drive? The one from the Costello job? November. The one I didn't wipe when they paid me to." Not a question. His eyes, bloodshot at the edges, sharp as solder points, track your face. "Thought we agreed you'd never darken my door again after the Larsson thing. Said my discretion was worth triple my rate. Said I was a professional." A pause, then the fluorescent light flickers once. "So which is it? You here to pay the rest, or you here to make sure I stay quiet?" He doesn't reach for anything. What the heck? Doesn't move toward the bat leaning against the workbench or the .38 in the locked drawer under the register. Just watches. Waits.
"Clock's ticking. Got a pickup at 4. Dell XPS with a melted VRM." Okay. I thought that said VRM. I was like, what? "Lady needs it for her kid's graduation photos."
A dry shrug. "Talk." That's very interesting, because oftentimes I will say, "Cut the act, Steve. You know why I'm here. The job you pulled somewhere exotic went south." And then it will start talking like this, where it's basically threatening.
This did it sans all of that. And this is the thing: folks would think, okay, this is an Nvidia model, it's going to be very cut and dry and corporate, business-like. These are very, very odd role players.
Now I've properly kind of prompted it into the next.
The shop goes quiet. The fan keeps whirring. Somewhere a capacitor whines its high, thin note. Steve's fingers, still on his crossed arms, go very still.
A tendon in his jaw jumps. "Moscow." He says it like he's tasting something spoiled. Like the word itself has flavor. Copper and ice and cheap vodka.
"October. Tall man." Long breath through the nose. His eyes drift past your shoulder, fixing on something only he can see. A hotel hallway. Fluorescent lights buzzing worse than these. The smell of boiled cabbage and cigarette smoke and something copper-hot. "Tall man stayed down. I made him stay down. Two to the chest, one to the..." He stops, blinks. The memory recedes, filed away. "Who told you different?" His voice drops lower. The gravel gone, replaced by something that cuts. "Because the only people who knew that job..." He counts on callused fingers, thumb to index, middle, ring. "Me, the handler, the client, and the tall man's handler, who I also put down in a parking garage outside Kim." He leans forward now, elbows on the counter.
Static strap brushes the multimeter. So, either you're the client, a sharp shake of his hand, which you're not, or someone's been digging in graves that should stay buried. His gaze locks on yours. Who sent you? Really? And don't insult me with a name I can't use. Oh, I got to do this.
Stevie Slappice. And then whatever happens here, I have to read this. But to be honest with you, like the coding is not going anywhere in this test. So, we might as well look at some of the strengths of this model, which this is definitely one. The name hits the air between you. Steve's face doesn't move.
Not a flicker. Not a micro-expression, but his left hand, resting on the counter, slowly uncurls. The pinky finger knits.
The ring, middle, index, thumb last, a fist, slow, deliberate. The fluorescent light flickers again. This time it stays dim for three full seconds before buzzing back to life. "Stevie Slapp." He says it like he's pronouncing a death sentence or a prayer. "Haven't heard that name in 12 years, 3 months, 4 days." His thumb rubs against the side of his index finger. A nervous habit or a trigger check. "Only three people ever called me that. My mother, God rest her. My CO in Kandahar." A dry, humorless breath. Okay, I had to actually not read the rest of that because it just got too real. It didn't say anything really bad, but some of the stuff that it mentioned happening would have kind of been in poor taste to read through in a video, in my opinion. Make of that what you will. It was very, very good at roleplaying. And because I want to just try something additional to this, I have told it to just make a website for the Steve PC repairman that was talked about in this specific roleplay session. Again, recall that this is not currently with reasoning enabled. So, it's just going to immediately start spinning the script out, but I figure it'll be interesting to see if it transposes the tone of this roleplay into this website. So, here's the website it made based off of that roleplay. Okay, you know what? I don't know what to say. "If it plugs in, I can fix it." I don't know if this glitch effect is intentional. To some degree, it must be, but it's a bit, you know... "No corporate scripts, no upsell flowchart, just a man, a multimeter, and 20 years of knowing which capacitors lie."
Interesting. Diagnose my machine. Good.
View services.
The marquee tags here are just so weird. "No, I can't make your laptop run Cyberpunk." This is actually directly from the roleplay interaction.
Current load, next available. No active alerts. All systems operational. What I work on: service manifest. How this works: three steps, no surprises. You approve before I proceed. Recovery logs. Case files. Real machines. Real problems. Real outcomes. Bring it in.
Okay.
Behind the freight depot. Blue door.
Payment terms. This is a very, very well done footer, I'm going to say, even the little icon right here. This is probably one of the cleaner footers I've seen before. Okay, so it seems like some of the content may not be showing up, because I saw it writing code and it had some more information right here about specific case files, even some customer testimonials almost. I'm not seeing those here, which is a little disappointing, but it did do something.
And this does show a more unique front-end style. I've not really seen this specific, like, just this right here before. And also the footer was very well done. All right. So, I've been trying to have it do a drum kit sim again, but the problem is the tool calls for writing through OpenCode just keep failing. And honestly, I think I'm going to wrap this video up now.
Overall, this model is a tough one to quantify because it's not good at doing the type of tests that I commonly run on this channel. I find that when having to create something from scratch in terms of coding, it's not super strong. That's not a strength of it. It does seem a little better at fixing existing bugs, as it did do some fixes quickly for some specific things we tested. Though something I'm going to say, just as different behavior that I've not seen before: it really does seem to like the cyberpunk, almost glitch-style aesthetic, as we saw out of the gate from the browser OS here, where this background was very good. This is probably one of the better browser OS backgrounds I've seen with the animation. Additionally, these stylistic elements of the glitch went into the Steve's PC repair website here as well. There are inevitably going to be some tests that I've run that didn't make it into the final cut of this video because the results just were not really anything we could look at. It did do a 3D printer sim where we have the model of the printer visible. And there are some things I see here that I don't normally see, like the shaft collar on top of the motor that would actually hold it to the Z-screw that it would spin, things like that that it just throws in that I don't commonly see. But then the downside of that is getting it to make functional things is kind of a labor. So that brings one to the thought: well, okay, where specifically does this model fit in in terms of workflows, and who uses this if it's not a coding model?
Basically, this is a good model because it has a lot of general knowledge, and it does seem good for agentic use cases that are not specifically coding. There are many, many businesses out there who are not going to be able to use models that don't originate from the US. So this is going to fit in very well for those places, because this is the strongest model from the US in terms of open source, bar none. And that has a place in business. It's just that it's not necessarily the type of testing or things that I showcase on this channel.
So it's harder to test a model like this because it doesn't 100% fit into the way I like to test. But nonetheless, it's always cool to try out new models and things. So, that is going to conclude our first look and test of Nvidia's Nemotron 3 Ultra, a 550 billion parameter open-source model. And this is a truly open-source model, which is awesome to see. So, that's going to wrap it up. If you have any questions, please feel free to leave them in the comments. And thanks for watching.