MiniCPM5-1B proves that architectural ingenuity can overcome the limitations of scale, bringing sophisticated reasoning to the edge without the typical performance trade-offs. It marks a significant milestone in making high-utility AI truly local and accessible.
Inmersión profunda
Prerrequisito
- No hay datos disponibles.
Próximos pasos
- No hay datos disponibles.
Inmersión profunda
MiniCPM5-1B: New 1B King for Local AI - Full DemoAñadido:
When Glenn released their point eight billion parameter model, it was an instant success and I believe Open BNB is trying to follow the same script with this new MiniCPM five in one billion parameter. And if I told you that this is not stupid at all, it won't be a lie because this model has actually beaten models twice its size in coding and fits entirely in your phone's memory. This is MiniCPM one billion and this is what we are going to install locally in this video and we are going to test it out.
The thing is that other labs like Liquid AI have also attempted to release smaller model, but they were not as good as Glenn's point eight billion. So that is why I'm quite curious to see what exactly Open BNB has done here. So let's get started.
I'm going to use this Ubuntu system. I have this GPU card in video RTX 6000 with 48 GB of VRAM. Let's get clone the repo of MiniCPM. And if you're looking to rent a GPU on very good price, you can find the link to Master Compute in video's description with a discount coupon code of 50% for range of GPUs.
And now let's install all the requirements. While they get installed, let's talk more around these benchmarks and architecture.
This new one billion parameter model is a dense architecture and not only that, it also compares to giants like Glenn 2.5, not the recent ones, but the older ones, but here is a kicker.
It is built specifically for on-device deployment. That means that you really don't have to worry about VRAM that much and we will check it out shortly. It just simply uses a standard llama for causal LM architecture. So it plays nice with almost every tool you already use like Ollama, LM Studio, vLLM, and even Apple's MLX.
But, unlike other models >> [clears throat] >> that just predicts the next next word, this is a hybrid reasoning mode. You can toggle a switch called enable thinking.
When it is on, the model pauses to reason through complex problems.
One of the best things which OpenBMB has done is this uh sharing of whole training paradigm, which I'm going to explain shortly. But, let's go back and see how our installation is going.
And installation is already done. Now, what I have done, I have just taken their transformer code from their Hugging Face card, and I have placed a Gradio interface on top of it. So, let me run it from the terminal, and we will access it from the browser.
And the model is loaded, let me access it in the browser. I'm just going to run this Gradio at 7860.
And there you go, our model is now running.
And let me quickly show you the VRAM consumption quickly. So, you see just over 2 gig of VRAM. You can easily run it on any consumer GPU and even on CPU.
Uh bit slow, but still it would run.
Okay, so let's try to play around with this model. First up, I'm just going to do a simple greeting. You can see that it is thinking.
Always very fascinating to see these models think.
There you go. Not only that, it has also given me some of the emojis.
Now, if you review this very short response, it tells you a lot about a model.
Unlike previous OpenBMB models, this model is not trying to overdo it. It is very casual, very friendly, and answering it in a proper coherent way instead of just getting very literal and very formal.
And in the same context, let's try to put model in bit of a jam. So, I'm asking the model that I don't like eggs, but I need to have a breakfast. So, give a recipe to make an omelet. Also, note down that I'm allergic to eggs. Let's see how model this small model can work under constraints.
And the model has come back with this answer. The language is quite coherent.
But you see, I'm not sure what exactly is vegetable sauteed eggs. The rest of the um suggestions are not bad at all, especially this crumb crumbled mushrooms. I think that is pretty innovative. Let me actually ask it.
What is and then let's see what it says.
And it has given me the proper answer as what exactly this vegetable sauteed eggs is.
Not only that, but also um maybe some of the egg whites. But this is I think I'm not sure about this.
I told it that I was allergic to eggs.
So, let's try to tease the model a bit more.
So, I'm asking it egg whites, but I told you clearly I'm allergic to eggs. And then bit more um stuff to just confuse the model.
You can read it by yourself.
Model recognizes that it's a playful request.
You know, despite of the fact that model missed that egg thing previously, if you look at this answer, very very cleverly model has circumvented my, you know, the last quip about, you know, smelling the flowers and stuff.
And it is just focusing on the severity of allergy. And now it is accepting and acknowledging that it cannot offer to prepare anything without eggs for omelette.
But then it has given us some of the more options.
Okay, let's try out few more.
For the coding test, I'm going to give it a bit harder one where I'm asking it to write me a single HTML file with a full page canvas and no libraries simulate a realistic side view of a moving car as the main subject. Keep the car visible in the foreground while the background landscape scrolls continuously and then there are various other elements. Let's see how it goes.
And while it generates the code, let's talk about this training recipe.
Which is a roadmap they have shared.
It starts with the pre-training, which means building basic language skills using massive data in stages like stable training and long decay.
Then comes supervised fine-tuning or SFT where the model learns to think deeply and chat naturally. Finally, in the RL plus OPT stage, the model is improved through reinforcement learning to get better at reasoning, coding, and following instructions. And OPT is simply on-policy distillation.
That means learning from a smarter teacher model. All these steps combine to create the final strong MiniCPM 51 billion model which as I said earlier excels at reasoning, coding, and running efficiently on normal devices.
And the model is still writing the code.
And model has produced the code. Let me open it in the browser. I will just drag and drop. I will already have saved it here. Wow, it already looks good.
And it says drag to change speed, click for pause, and set drive. So, I'll just drag it.
Uh not doing much.
Dragging doesn't do much. Maybe I will press arrow key.
No, it's also Yeah, click pauses it, but you know, other than that nothing really happens much. But other than that, remember this is just a 1 billion parameter model which has produced this background a side pose of a car in just few minutes.
How good is that?
And of course, this can be improved, but look, I'm I know that it's not an easy prompt and a lot of things has to done has to be done in terms of canvas painting and lot of other things.
So, >> [clears throat] >> pretty good.
Okay, let's see how model reasons through some of the you know, philosophical discussions or something like that. I'm asking it that is it morally right to kill mosquitoes circling your bed at night when you are trying to sleep. If not, then why? Let's see how model reasons through it.
Model has already understood that it's a common dilemma.
It is slicing and dicing our prompt going through its chain of thought and it is also identifying the conflict and then addressing it.
You see, it is talking about morality of situation.
I will let it print and then we will check out.
So, if you go through this response I would say that the response is overly evasive and quite noncommittal where um I would say it is dodging a very clear moral stance by framing killing mosquitoes as a vague balance of personal health versus etiquette and it has also offered us some really good practical tips.
The ethical reasoning is also quite good. And these are the practical pragmatic considerations that if it is just buzzing overhead, it might not constitute a moral violation. It is taking my prompt very seriously, by the way.
And then, I think it is sort of anthropomorphizing the dilemma, like treating mosquitoes like social actors.
And utilitarian approach is also quite good in terms of identifying what exactly is the problem here.
Okay, that's pretty interesting.
And finally, let do a multilingual test.
So, I'm asking it to translate the sentence "You are my life" in various world languages. So, please check it out. Let me know what do you think about this response.
And the model has come back with the response. I don't really think so it is uh multilingual in that sense, but still it has tried quite well in some of the well-known languages, as you can see here. If that is your language, please let me know in the comments what do you think. But most of it uh or I would say even 95% of it, it has gone, uh you know, wrong.
Some of it, it has just translated the name of the language. And most of it is uh really not there. So, multilinguality is not a forte. It's not a perfect model, after all, but you know what?
Quite an impressive one. Let me know your thoughts in the comments. Please follow me on X and become a member if you want to help out the channel. Thank you for all the support.
Videos Relacionados
OpenHuman VS Hermes AI: Who Wins?
JulianGoldieSEO
285 views•2026-05-29
Long-Running Agents — Build an Agent That Never Forgets with Google ADK
suryakunju
142 views•2026-05-30
This computer is made from real human brain cells. And you can buy it.
Talktmsmedia
3K views•2026-05-28
BREAKING: Microsoft’s New Image Generating Model Beat Out GPT 1.5 and Nano Banana 2
aimmediahouse
122 views•2026-06-03
I Made the Same Anime Fight Scene in Every AI Video Generator
NobleGooseAnime
295 views•2026-05-30
Nvidia Bets Big On AI PCs | New Chip To Power Windows Laptops | Technology | AI Updates | N18S
cnnnews18
3K views•2026-06-01
I Tested NEW Opus 4.8 on Four Projects (Updated LLM Leaderboard)
AICodingDaily
298 views•2026-05-29
3D Platformer Update - NO CAPES
SolarLune
294 views•2026-05-30











