Cerebras' wafer-scale engine (WSE-3) addresses the memory wall bottleneck in AI chips by integrating 44GB of memory directly on a single 46,225 square millimeter silicon chip, achieving 21 petabytes per second of on-chip memory bandwidth—7,000 times greater than Nvidia's H100—while maintaining 93% active silicon through microscopic core design that tolerates manufacturing defects.
Deep dive
This 900,000 Cores & 3-Billion Transistor AI Chip Just Made Nvidia’s AI GPUs Look Like a JOKE!
Nvidia makes the chips that power basically every AI you've ever used.
ChatGPT, the image generators, virtually all of it. For years, they had zero real competition. If you wanted to build serious AI, you bought Nvidia, you bought thousands of them, and you paid whatever Jensen Huang felt like charging in the morning. That's how you build a multi-trillion-dollar company. And then a company out of Sunnyvale, California, looked at how literally everyone makes AI chips and said, "Yeah, we're not doing any of that." What they build is so strange, so completely oversized, that the first time I saw the specs, I thought it was a typo. It's a single chip the size of a dinner plate, and it's already beating Nvidia's best hardware at the one thing that matters most. I'm talking about Cerebras and their wafer-scale engine, and I think it's one of the most exciting things happening in tech right now. So, let me show you exactly why Nvidia should be a little nervous.
Here's the thing most people get wrong about AI chips. Everyone obsesses over raw compute power. How many calculations per second, how many cores, whatnot. But that's not actually the thing slowing AI down. The real villain is something engineers call the memory wall. Let me break it down simply. In a normal computer, in a normal GPU like Nvidia's, you've got two separate things. You've got the part that does the math, and you've got the memory where the data lives. And these two are physically apart from each other. So, when your chip wants to do a calculation, it has to go fetch the data from memory, drag it over to the compute cores, do the math, and then ship it back. Sounds fine, right? Except modern AI models have hundreds of billions, maybe even trillions of parameters, and the chip is constantly hauling all that data back and forth billions of times.
The compute cores end up just sitting there tapping their feet waiting for the data to show up, and you've got this incredibly fast engine, and it spends half its time idling because the fuel line is too skinny. That's the memory wall, and honestly, no amount of just adding more cores will ever fix it. You can build the fastest race car on Earth, but if it's stuck in traffic, who cares?
That traffic jam is what's been quietly limiting AI for years.
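To put a rough number on that idling, here is a minimal roofline-style sketch in Python. The peak compute, memory bandwidth, and operations-per-byte figures are made-up illustrative values, not the specs of any real chip, and the function names are hypothetical.

```python
def attainable_flops(peak_flops, mem_bandwidth, flops_per_byte):
    """Roofline model: throughput is capped by compute or by memory traffic."""
    return min(peak_flops, mem_bandwidth * flops_per_byte)


def compute_utilization(peak_flops, mem_bandwidth, flops_per_byte):
    """Fraction of peak compute that the memory system can keep busy."""
    return attainable_flops(peak_flops, mem_bandwidth, flops_per_byte) / peak_flops


# Hypothetical chip: 1000 TFLOP/s of compute fed by 2 TB/s of memory bandwidth.
PEAK = 1000e12
BANDWIDTH = 2e12

# Hypothetical workload: 250 floating-point operations per byte fetched.
util = compute_utilization(PEAK, BANDWIDTH, flops_per_byte=250)
print(f"utilization: {util:.0%}")  # 50%

# Doubling the cores while leaving bandwidth alone does not raise throughput.
more_cores = attainable_flops(2 * PEAK, BANDWIDTH, 250)
print(more_cores == attainable_flops(PEAK, BANDWIDTH, 250))  # True
```

With these assumed numbers the cores sit idle half the time, and adding cores leaves the attainable throughput unchanged because the memory term is the one that binds.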
So, here's where Cerebras does something completely nuts. Every other chip company, Nvidia included, starts with a big round silicon wafer, then cuts it up into dozens of small chips. An Nvidia H100 is around 814 square millimeters. That's roughly the size of a postage stamp. They cut it small on purpose because the bigger a chip gets, the more likely it picks up a manufacturing defect, and just one defect normally kills the whole thing.
Cerebras looked at it and said, "What if we just don't cut it? What if the entire wafer is the chip?" And that's exactly what they did. Their latest one, the WSE-3, is 46,225 square millimeters of silicon. That's roughly 57 times bigger than an H100. We're talking a piece of silicon roughly the size of a dinner plate. Around 21 and 1/2 cm on each side. The numbers in this thing are genuinely ridiculous. 4 trillion transistors, 900,000 AI cores, and for comparison, the H100 has just under 17,000 cores. So, the Cerebras chip has around 52 times more. It does 125 petaflops of peak AI performance and it's built on TSMC's 5 nanometer process. But here's the part that actually matters. The whole reason they went this big. Remember the memory wall, the traffic jam between compute and memory? Cerebras basically eliminated the highway because everything is in one giant piece of silicon. They put the memory and the compute right next to each other on the same chip. There's 44 GB of memory living directly on the wafer.
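The size comparisons are easy to sanity-check. This is a small Python sketch of the arithmetic, assuming a square die and the commonly published H100 die area of 814 square millimeters (that figure is my assumption); the other numbers are the ones quoted above.

```python
import math

WSE3_AREA_MM2 = 46_225   # quoted in the transcript
WSE3_CORES = 900_000     # quoted in the transcript
H100_AREA_MM2 = 814      # assumption: commonly published H100 die area
H100_CORES = 17_000      # transcript says "just under 17,000"

# A square die of this area has a side equal to the square root of the area.
side_cm = math.sqrt(WSE3_AREA_MM2) / 10
area_ratio = WSE3_AREA_MM2 / H100_AREA_MM2
core_ratio = WSE3_CORES / H100_CORES

print(f"side length: {side_cm:.1f} cm")
print(f"area ratio:  {area_ratio:.0f}x")
print(f"core ratio:  {core_ratio:.0f}x")
```

The side comes out at 21.5 cm and the area ratio at about 57. The core ratio with the rounded 17,000 figure is about 53, in the same ballpark as the roughly 52 quoted.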
Now, look at the bandwidth, which is the speed at which data can actually move around. The WSE-3 has 21 petabytes per second of on-chip memory bandwidth. According to Cerebras' own numbers, that's about 7,000 times more than Nvidia's H100. 7,000.
That's not a small improvement. That's a different category of machine. And the cores talk to each other through the wafer with single clock cycle latency, basically instant with no need to route the signals off the chip and onto some slow external connection. The data has almost nowhere to travel. It's already there. So, that idle problem I mentioned previously where the cores sit around waiting for the data, on a chip like this, it mostly goes away and you can really see it in the real world. As of late 2025, Cerebras was running Meta's Llama 4 Maverick model. That's a 400 billion parameter model at around 2,500 tokens per second per user. That's reportedly more than double what Nvidia's flagship Blackwell B200 system does on the exact same model. When the bottleneck disappears, the speed shows up. I love this approach. Honestly, it's the kind of "wait, are we even allowed to do that?" engineering that I find thrilling.
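To see what a 7,000x gap means in practice, here is a rough Python sketch that works out the H100 bandwidth the comparison implies and how long one pass over 44 GB takes at each rate. It assumes decimal units (1 PB = 10^15 bytes) and perfectly sustained bandwidth; the implied H100 number is derived from the quoted ratio, not taken from a spec sheet, and `sweep_time` is a hypothetical helper.

```python
WSE3_BANDWIDTH = 21e15   # bytes/s, quoted: 21 PB/s on-chip
QUOTED_RATIO = 7_000     # quoted: "about 7,000 times more"
ON_CHIP_MEMORY = 44e9    # bytes, quoted: 44 GB on the wafer

# Bandwidth the comparison implies for the H100 (derived, not a quoted spec).
implied_h100_bandwidth = WSE3_BANDWIDTH / QUOTED_RATIO


def sweep_time(num_bytes, bandwidth):
    """Seconds needed to read num_bytes once at a sustained bandwidth."""
    return num_bytes / bandwidth


wse_us = sweep_time(ON_CHIP_MEMORY, WSE3_BANDWIDTH) * 1e6
gpu_ms = sweep_time(ON_CHIP_MEMORY, implied_h100_bandwidth) * 1e3

print(f"implied H100 bandwidth: {implied_h100_bandwidth / 1e12:.1f} TB/s")
print(f"44 GB at 21 PB/s:       {wse_us:.1f} microseconds")
print(f"44 GB at implied rate:  {gpu_ms:.1f} milliseconds")
```

Under those assumptions the implied GPU figure is about 3 TB/s, and a full pass over 44 GB drops from roughly 14.7 milliseconds to roughly 2.1 microseconds.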
Everyone else was optimizing the traffic. Cerebras just built a city where nobody has to commute. Now, you're probably thinking what I thought. If big chips fail because of defects, how on Earth do you build the biggest chip ever made and have it actually work? That's the clever part and it's why Cerebras pulled off something people in the industry thought was 100 years away. A chip this size will pick up around 46 defects during manufacturing. On a normal chip, any single defect is a death sentence. So, Cerebras made their cores absolutely tiny. Each core is about 0.05 square millimeters, roughly one hundredth the size of an Nvidia core. They build around 970,000 of them. So, when a defect hits, it doesn't kill the chip. It kills one microscopic core and the chip just routes around it using the redundant cores they built in. They've said they keep around 93% of the silicon active, which is actually better than what a lot of GPUs manage. They didn't make defects disappear, they made each defect cheap.
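The "make each defect cheap" idea can be sketched with a toy model in Python. This is a simplification under stated assumptions, not Cerebras' actual redundancy scheme: defects follow a Poisson model, each defect lands on one uniformly random core and kills only that core, and any spare can stand in for any dead core. The function names are hypothetical.

```python
import math
import random

PHYSICAL_CORES = 970_000   # quoted: cores built on the wafer
ACTIVE_CORES = 900_000     # quoted: cores exposed as usable
DEFECTS = 46               # quoted: expected defects on a chip this size


def monolithic_yield(expected_defects):
    """Poisson model: probability that a die ends up with zero defects."""
    return math.exp(-expected_defects)


def cores_lost(num_cores, num_defects, rng):
    """Drop each defect on a random core; count the distinct cores killed."""
    return len({rng.randrange(num_cores) for _ in range(num_defects)})


rng = random.Random(0)  # fixed seed so the run is repeatable
dead = cores_lost(PHYSICAL_CORES, DEFECTS, rng)
spares = PHYSICAL_CORES - ACTIVE_CORES

print(f"yield if one defect kills the chip: {monolithic_yield(DEFECTS):.1e}")
print(f"dead cores: {dead} of {PHYSICAL_CORES:,} (spares: {spares:,})")
print(f"active fraction: {ACTIVE_CORES / PHYSICAL_CORES:.1%}")
```

In this toy model an all-or-nothing chip with 46 expected defects essentially never works, while the many-tiny-cores design loses at most 46 cores out of 970,000 and has 70,000 spares to cover them. The active fraction, 900,000 out of 970,000, matches the roughly 93% quoted.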
That's a beautiful piece of problem solving. They also had to invent custom machinery just to assemble the thing, build a special connection material so that wafers wouldn't crack from heat expansion, and figure out how to cool a single chip that pulls around 23 kilowatts of power. That's the power of several houses running through one piece of silicon. All right, I'm not going to sit here and pretend this is a flawless Nvidia killer because it isn't, and the honest picture is much more interesting anyway. Building these is hard and expensive. Estimates put a single WSE chip somewhere in the two to three million dollar range. You're not putting that in a gaming PC unless you're maybe Jeff Bezos. The packaging, the cooling, the custom power delivery, it's all bleeding edge and it's complicated, and complicated things break in ways simple things don't. Then there's the flexibility part. Nvidia's real moat isn't just the hardware, it's CUDA, the software ecosystem that millions of developers already know inside and out.
Cerebras has its own stack and while it's solid, the whole world isn't trained on it. That's a massive head start for Nvidia that chips alone don't erase. And scaling has a trade-off, too.
When a model is bigger than a single wafer can hold, you have to link multiple systems together and that brings back some of the very complexity the single chip was meant to avoid. So, it definitely isn't game over for Nvidia, not even close. But is Nvidia actually in trouble? Here's my honest take and I want you to sit with it for one second. For years, there was no plan B. It was Nvidia or nothing. Every AI company on the planet lined up to pay the toll because there was no other road. That's the part that's changing right now because the customers are voting with their wallets. Cerebras signed a 20 billion dollar agreement with OpenAI for inference capacity.
Amazon dropped their system straight into its Bedrock service, and in April 2026, Cerebras filed to go public on the Nasdaq, reportedly chasing a valuation north of 22 billion dollars off around half a billion in revenue. They stopped being a science experiment a while ago. This is a real company taking real money on a real bet that the entire industry was building chips the wrong way. And that's the thing that gets me.
Nvidia isn't losing the crown tomorrow.
They're simply too rich for that.
They're too entrenched. And they're still ridiculously good at what they do.
But in trouble doesn't have to mean dethroned. It means for the first time in years, somebody walked up with a completely different idea, and it actually works. And the giants are paying attention. The most boring story in tech is one company winning forever.
And Cerebras just took that boring ending off the table. So, is Nvidia in trouble? Can Cerebras enter the race and become an actual competitor? You tell me.