How Reducing an AI Execution Bottleneck Lowers Needed Power and Cooling | Sambanova
Added: 2026-05-19

146 views | 41:37 | ipXchange | Original Release: 2026-05-14

The Reconfigurable Dataflow Unit (RDU) uses a dataflow-based execution model combined with a three-tier memory structure to enable AI inference systems to map entire decoder phases of Large Language Models onto hardware, eliminating the kernel-by-kernel overhead that plagues traditional GPU architectures and resulting in faster, more energy-efficient inference.

[00:00:00]Hey, how's it going, man?

[00:00:01]>> I'm good. Can you tell me a little bit about what you're doing over here at SambaNova? >> Yeah, so SambaNova is building new chips for the agentic AI era.

[00:00:09]We do that by creating entirely new proprietary silicon that's unlike any other architecture in the world. We call it the RDU, the Reconfigurable Dataflow Unit, and the main value proposition of it is that you're actually able to run your AI inference faster than you've ever been able to before while also consuming way less energy. >> And this will, I assume, be reflected in lower thermals, as well.

[00:00:32]>> Yeah, exactly. So, you're able to deploy this in a lot of interesting and more remote environments, places that data centers haven't been able to go historically. And so, we're really hopeful that this is going to open up a lot of advantages and allow people to continue building more and more complex applications while, hopefully, you know, saving the environment a little bit, too. >> Can you explain a little bit about how the RDU accomplishes this low power consumption? >> Yeah, so basically, it runs on a dataflow-based execution model.

[00:01:00]And also, we have a three-tier memory structure, which allows us to switch memory in and out of access. And the combination of those two things allows us to map the entire decoder phase of an LLM onto our system, whereas a GPU has to execute kernel by kernel.

[00:01:13]Because we've eliminated that bottleneck and overhead, that's one of the reasons our systems run so fast and efficiently.
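[Editor's illustration] The kernel-by-kernel overhead described above can be sketched with a toy cost model. This is only a rough sketch under stated assumptions, not a description of how the RDU or any GPU actually schedules work: the op names, the timings, and the single fixed "launch overhead" are all hypothetical, and the model ignores other effects such as memory traffic between ops.

```python
from dataclasses import dataclass


@dataclass
class Op:
    name: str
    compute_us: float  # hypothetical time spent on useful work, in microseconds


def kernel_by_kernel_us(ops, launch_overhead_us):
    """Each op is launched separately and pays a fixed overhead every time."""
    return sum(op.compute_us + launch_overhead_us for op in ops)


def mapped_once_us(ops, launch_overhead_us):
    """The whole op graph is mapped as one unit, so the overhead is paid once."""
    return launch_overhead_us + sum(op.compute_us for op in ops)


# Made-up ops standing in for one decoder step (illustrative numbers only).
decoder_step = [
    Op("qkv_proj", 10.0),
    Op("attention", 20.0),
    Op("out_proj", 10.0),
    Op("mlp", 30.0),
]

overhead_us = 5.0  # assumed fixed cost per launch
separate = kernel_by_kernel_us(decoder_step, overhead_us)
fused = mapped_once_us(decoder_step, overhead_us)
print(f"kernel by kernel: {separate} us, mapped once: {fused} us")
```

In this model the gap between the two totals grows with the number of ops, which is the intuition behind mapping a whole decoder phase at once rather than launching each piece separately.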

[00:01:18]>> How would an engineer be able to get a hold of this to evaluate it for themselves? >> Yeah, so you can learn more about us at sambanova.ai. We also have a cloud, which you can reach from our website, that allows you to get an API key and start playing around with it right in your local environment.

[00:01:33]Thank you very much for showing us that.

[00:01:35]Of course, man. Hey, it's a pleasure.

[00:01:36]Thanks so much.

#semiconductor #electronics #electronics design #electronics engineering #semiconductors
