Instala nuestra extensión para buscar dentro de cualquier video al instante

How Reducing an AI Execution Bottleneck Lowers Needed Power and Cooling | Sambanova
Añadido: 2026-05-19

146 vistas41:37ipXchangeLanzamiento original: 2026-05-14

The Reconfigurable Dataflow Unit (RDU) uses a dataflow-based execution model combined with a three-tier memory structure to enable AI inference systems to map entire decoder phases of Large Language Models onto hardware, eliminating the kernel-by-kernel overhead that plagues traditional GPU architectures and resulting in faster, more energy-efficient inference.

[00:00:00]Hey, how's it going, man?

[00:00:01]>> I'm good. Can you tell me a little bit about what you're doing over here at SambaNova? Yeah, so SambaNova are building new chips for the AI agentic era.

[00:00:09]We do that by creating an entirely new proprietary silicon that's unlike any other architecture in the world. We call it the RDU, the reconfigurable data flow unit, and the main value proposition of it is that you're actually able to run your AI inference faster than you've ever been able to before while also consuming way less energy. And this will, I assume, reflect in lower thermals, as well.

[00:00:32]Yeah, exactly. So, you're able to kind of like deploy this in like a lot of like interesting and like more remote environments, like places that data centers haven't been able to go historically. And so, we're really hopeful that this is really going to open up a lot of like advantages and allow people to continue building like more and more complex applications while, hopefully, you know, saving the environment a little bit, too. Can you explain a little bit about how the RDU accomplishes these low power? Yeah, so basically, it runs on a data flow-based execution model.

[00:01:00]And also, we have a three-tier memory structure, which allows us to switch memory in and out of access. And the combination of those two things allows us to map the entire decoder phase of an LLM onto our system compared to a GPU, which has to execute kernel by kernel.

[00:01:13]Because we've eliminated that bottleneck and overhead, that's one of the reasons our systems run so fast and efficiently.

[00:01:18]How would an engineer be able to get a hold of this to evaluate it that for themselves? Yeah, so you can learn more about us at sambanova.ai. We also have a cloud, which you can also visit from our website itself, that allows you to get an API key and start playing around with it right on your local environment.

[00:01:33]Thank you very much for showing us that.

[00:01:35]Of course, man. Hey, it's a pleasure.

[00:01:36]Thanks so much.

#semiconductor #electronics #electronics design #electronics engineering #semiconductors

Videos Relacionados

Re: 🗣️📍theprophedu📍2026 GST 103 CLASS (E-EXAM REVISION)

theprophedu

636 views•2026-06-04

WEB TECHNOLOGIES UNIT-2 | Degree 4th sem BCOM Computers web technologies unit-2 full explanation💯✅

LearnwithSahera

1K views•2026-05-29

More tests are always better? How to use AI to identify tests that bring little value

Alliance4Qualification

335 views•2026-05-29

Search Algorithms Explained in 60 Seconds! 🤖💨

samarthtuliofficial

218 views•2026-06-01

Making Minecraft Clone with C++ & Raylib

PecaCSLive

686 views•2026-06-04

People of Game of Thrones using JavaScript DOM

AltCampus

296 views•2026-05-30

Instagram accounts got PWNed

EricParker

13K views•2026-06-03

Introduction to Problem Solving Part - 1 | Lecture 1 | Intermediate DSA

ascensionix

107 views•2026-05-29

Tendencias

Why Batman Lets The Joker Live 🤨

zackdfilms

9222K views•2026-05-30

They're Complete Trash

penguinz0

558K views•2026-06-04

The Murder of Deputy Caleb Conley

MidwestSafety

810K views•2026-06-04

I Bought FAKE HopeScope Merch (and paid a subscriber to give it a makeover) | Hopeful Hauls

HangWithHopescope

158K views•2026-06-04