
AI Models Ambiq Can Run at the Edge
Added: 2026-06-01

220 views • 13:50 • Ambiq_AI • Original Release: 2026-05-30

Ultra-low power edge AI processing is achieved through sub-threshold silicon technology combined with optimized software runtimes, enabling multiple AI models to run simultaneously on resource-constrained devices like wearables while maintaining battery life of weeks; this approach reduces power consumption by orders of magnitude compared to traditional approaches, allowing applications such as health signal processing, speech enhancement, and AI-based data compression to operate efficiently on battery-powered devices.

[00:00:00][music] >> Hello, this is the AI Exchange, and it's your host Sandro here. If you look behind me, you might recognize some of these devices. You might even be wearing one.

[00:00:12]We're here today at Ambiq where they have made some of these possible. We're going [music] to be looking at some AI software development toolkits and AI models. So let's get into it.

[00:00:22]>> So I'm Dr. Adam Page, I'm the head of AI at Ambiq. Yeah, I've got three demos set up today, and I can kind of walk you through what they do, a little bit of background about, you know, [music] what Ambiq's about, what we're going for, and what sets us apart from our competition.

[00:00:34]>> What would you say is the core of what you do here at Ambiq?

[00:00:38]>> Yeah, I mean, the core that underlies everything that we do is our SPOT technology. SPOT stands for Subthreshold Power Optimized Technology. So it's kind of a platform.

[00:00:46]Basically, we run our silicon sub-threshold. Maybe you've seen that before; analog devices kind of do that same thing, but we do this at mass scale, and that's really difficult to do. What that ultimately means is we get to run our voltages much lower, and that translates to much lower power for our customers.

[00:01:03]And this is like an order of magnitude better than what the competition does. So we were founded in 2010 on the SPOT platform. We started with real-time clocks, and then we started our foray into microcontrollers. So Cortex-M4, Cortex-M55, and now we're getting into bigger kinds of silicon that also have NPUs. We're really not a microcontroller anymore, it's a system on chip. We have GPUs, DSPs, NPUs. It's a really heterogeneous system that we have, and underpinning all of this is our sub-threshold technology.

[00:01:31]>> So that sub-threshold technology allows you to make these very low power devices.

[00:01:35]>> Absolutely.

[00:01:36]>> Uh, so... >> Yeah. So in a lot of the different devices you see, we're usually in one of those, either as the main processor or as a sensor hub. That basically allows them to use our silicon as much as possible, and that gives them all-day battery, right? Weeks of battery, whatever the application is. Smart rings, hearables, wearables, even smart glasses is another area that we're getting into.

[00:01:54]>> What part of the system do you fit into in, let's say, edge AI, for example?

[00:02:00]>> Right, so it used to be we were more like a sensor hub, but we're actually becoming more of the central processing unit for a lot of these different devices. So we're going to be doing everything externally. We have I2C, we have I2S, we can do digital mics, analog microphones.

[00:02:13]We support USB. We also have different wireless communication protocols that we support. Our latest ones are going to support Thread and Matter.

[00:02:20]Um, so we're really transitioning from like a sensor hub like a co-processor to actually being the main processor for a lot of these wearable devices.

[00:02:27]>> Okay, so we have here some uh demos that you've kindly set up for us. Can you walk us through a little bit about what we're looking at and what kind of technology that allows you to show us these demos?

[00:02:39]>> Yeah, so all of these demos today are shown on our Apollo510. That's Cortex-M55 based. It also has the vector extension engine (MVE). That's also what we're using to accelerate some of the AI workloads.

[00:02:49]So obviously the core thing is the SPOT technology, but in addition to that is the software that optimally uses that, right? A lot of our customers today say, "Hey, we want to deploy models onto your silicon. We want to do that efficiently."

[00:02:59]It used to be they'd use something like TensorFlow Lite for Microcontrollers or some other open-source service. But we've actually made our own ahead-of-time runtime called HeliaAOT. We also have HeliaRT. These are two runtimes where basically you bring your model and it optimally deploys onto our silicon.

[00:03:14]So it's going to take advantage of the MCU, it's going to take advantage of the MVE.

[00:03:18]And so these are showing not only that you can run a model, but in some of these cases we're running three or four models at the same time.

[00:03:25]Which is quite interesting. So this first one is geared towards remote patient monitoring. The models are built using our HeartKit. Something to help get customers accelerated is that we have these AI development kits. They're software based. You can kind of think of it like a model zoo, except it's not just a static list of models. We actually give you a playground. You can bring your own model, you can bring your own data set, you can bring your own task. We have pre-trained models that you can start from. You can customize those or even extend them to different kinds of capabilities.

[00:03:53]>> So, an engineer today could use those services. >> Yep. >> They could test, um... >> Exactly. They can use one of our AI development kits to use one of those pre-trained models. They could create their own. And then we have tools that basically deploy it. So today we actually support Zephyr. Both HeliaAOT and HeliaRT support Zephyr as well.

[00:04:10]So, they're like Zephyr modules.

[00:04:12]So you can get started like you do today with any kind of Zephyr-based project. You can pull in your models, deploy them efficiently, and you're going to see results orders of magnitude better than what you're going to see from the competition.

[00:04:22]So what we do with these demos is show multiple things. We're showing how ultra-low power it is. We're showing the fact that you can run not just one model but multiple models. These models were trained using our AI development kits as well.

[00:04:34]And we're showing different domains. The far left is HeartKit based, so again it's more like remote patient monitoring. In the middle we have a SoundKit-based demo, so it's something like speech enhancement. That's really good for conversational UI where you want to enhance the speech, and for telco kinds of communication it's critical as well. And the thing we've done recently is a new one, it's called CompressionKit. We have a lot of customers today, right, smart rings, fitness bands. While they want to do inference on the edge, that data is still precious to them, so they still want to transmit it to the cloud. Obviously transmitting stuff over BLE or Wi-Fi is power consuming. So CompressionKit is AI based, and it compresses specific modalities. Let's say you're doing PPG or ECG: it knows how to effectively compress that kind of modality. And we can get like 16x to 32x compression. That means up to 32 times less memory needed and 32 times less data you need to transmit. And when you reconstruct it, you're going to get back something almost identical to what you started with.

[00:05:29]>> Okay, so it essentially allows you to still keep that low power while also being able to transmit that data.

[00:05:36]>> It kind of gives you different capabilities. Some customers today don't want to transition completely to the edge. So they can compress it, they can go to the cloud, and they can run their AI models there.

[00:05:45]They can even deploy some and do A/B testing. So they can say, "Hey, we're running this on the edge. We also want to go verify it in the cloud." There are different kinds of combinations that they can work with. Ultimately, it's saving the power and the amount of external memory that they're going to need on their device.

[00:05:58]>> Okay. So let's have a more in-depth look at some of these demos now. What exactly are we looking at with each... >> Yeah. I'll start on the very far left.

[00:06:06]So it's a bit busy, what's going on here, but really what we're showing is that we're streaming ECG as well as PPG signals. And right now we're running three different AI models. You can imagine an ECG signal, as you're wearing the device, is going to be highly noisy. So the first thing that we do is denoise the signal. That's what we're showing here with the model and the efficiency that it's gaining. The next thing you want to do is segmentation, right? If you're looking for heart attacks, you're looking for AFib, you're going to want to know where the P wave, the QRS complex, and the T wave are. So we actually do segmentation, which you can kind of see here. It's identifying all the different parts of it.

[00:06:37]And the last thing that we're showing is an arrhythmia classifier. So this is doing AFib, atrial flutter, bradycardia, tachycardia. It can detect these different kinds of arrhythmia events. All of this is running on the device. So, three models running. We're showing the amount of CPU usage. Even though the three models are running, it's only taking up 25% of the CPU. The core itself, the M55, could last up to like 3 weeks on two coin cell batteries, which is, you know, impressive.

[00:07:02]And then we show different kinds of variables. We show what the heart rate is, the heart rate variability. We're showing how well it's denoising. This is also something we partnered with ams on. So this is their Click module. One of the nice things about all of these demos is that we've gone with a MikroE Click module form factor, which means you can use all those different sensors, bring them in, snap them on, and start evaluating.
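The three-stage ECG flow described above (denoise, then segment, then classify) can be sketched in a few lines. This is only an illustration of how the stages chain together: the demo uses trained neural models, whereas every function below is a simple stand-in (a moving average, a threshold peak picker, and a rate-only label using the common 60 and 100 bpm cutoffs). The function names, the synthetic signal, and the 100 Hz sample rate are all assumptions made for this sketch.

```python
from statistics import mean

def denoise(signal, window=5):
    """Stand-in denoiser: centered moving average (the demo uses a neural model)."""
    half = window // 2
    return [mean(signal[max(0, i - half): i + half + 1]) for i in range(len(signal))]

def segment_beats(signal, threshold):
    """Stand-in segmenter: one index per beat, taken at the trailing edge of each peak."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] >= signal[i - 1]
        and signal[i] > signal[i + 1]
    ]

def classify_rhythm(peaks, sample_rate_hz):
    """Stand-in classifier: rate-only label from the mean beat-to-beat interval."""
    if len(peaks) < 2:
        return "unknown", None
    rr_seconds = [(b - a) / sample_rate_hz for a, b in zip(peaks, peaks[1:])]
    bpm = 60.0 / mean(rr_seconds)
    if bpm < 60:
        return "bradycardia", bpm
    if bpm > 100:
        return "tachycardia", bpm
    return "normal rate", bpm

# Hypothetical input: 10 s at 100 Hz with one spike every 0.5 s (120 beats per minute).
SAMPLE_RATE_HZ = 100
raw = [1.0 if i % 50 == 25 else 0.0 for i in range(1000)]

cleaned = denoise(raw)                         # stage 1
peaks = segment_beats(cleaned, threshold=0.1)  # stage 2
label, bpm = classify_rhythm(peaks, SAMPLE_RATE_HZ)  # stage 3
print(label, bpm)
```

Each stage consumes the previous stage's output, which is why the denoiser runs first: the later stages see a cleaner signal than the raw sensor stream.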

[00:07:25]>> Fantastic. And then this is your neural network for voice enhancement.

[00:07:31]>> Yeah. So this is speech enhancement. We have a few different models that we provide. Most of these today are open source. And we're also looking to have ones that may be even more capable in the future. What we're showing here is, you know, I can run through these, we kind of cycle through them, but if I click, let me see, baby crying.

[00:07:46]It's always a favorite.

[00:07:48]>> [laughter] >> So I can pick baby crying, and what we're seeing is both the original signal and the cleaned-up signal, right? If you look at the spectrogram, there's all this high-frequency noise.

[00:07:58]After we clean it, you can see it gets rid of that. We can actually play it.

[00:08:01]>> regard for decency and decorum.

[00:08:03]It probably was conscious that Mrs. Rachel was sitting at her window.

[00:08:07]>> And then if we do the cleaned up version >> decency and decorum.

[00:08:11]It probably was conscious that Mrs. Rachel was sitting at her window.

[00:08:15]>> So that's one model that we have, and in fact we have all these different kinds of model variants. This is a small model we have running on it today. And the thing that I like to show here is, you know, I brought up HeliaAOT before. So I compared this to TensorFlow Lite for Microcontrollers, the open-source alternative. Not only does AOT make it more efficient, it actually makes the throughput almost 4x greater. On memory efficiency, it uses about one and a half times less memory than TensorFlow Lite for Microcontrollers. And energy efficiency is over four times better than TensorFlow Lite for Microcontrollers.

[00:08:43]So it's not just the silicon, it's the silicon-adjacent tools that really make the difference.

[00:08:46]>> Yeah, so that compounds the benefit you have from using that... >> It's like a multiplier on top of it, right? We get to optimally map which memory it's going to be stored in, whether we should be using MVE, which intrinsics to call. There's a bunch of different kernels. Doing this ahead of time, we know exactly which ones to pick, whereas TensorFlow Lite does it at run time. So it needs to know at run time which kernel to use, which means way more overhead and way more memory needed.
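The contrast between picking kernels at run time and picking them ahead of time can be illustrated with a toy model. This is a sketch of the general idea only, not Ambiq's runtime or TensorFlow Lite's internals: the kernel table, the kernel names, and the three-layer model are all hypothetical. It counts how many kernel lookups happen during inference in each scheme and which kernels would need to be linked into the firmware.

```python
# Hypothetical kernel table: (op, dtype, has_vector_unit) -> kernel name.
KERNELS = {
    ("conv2d", "int8", True): "conv2d_int8_vector",
    ("conv2d", "int8", False): "conv2d_int8_scalar",
    ("dense", "int8", True): "dense_int8_vector",
    ("dense", "int8", False): "dense_int8_scalar",
}

# Hypothetical three-layer model described as (op, dtype) pairs.
MODEL = [("conv2d", "int8"), ("conv2d", "int8"), ("dense", "int8")]

def run_interpreted(model, has_vector_unit, n_inferences):
    """Run-time dispatch: resolve the kernel for every layer on every inference."""
    lookups = 0
    for _ in range(n_inferences):
        for op, dtype in model:
            _kernel = KERNELS[(op, dtype, has_vector_unit)]  # resolved while running
            lookups += 1
    return lookups

def compile_ahead_of_time(model, has_vector_unit):
    """Ahead-of-time: resolve kernels once and keep only those the model needs."""
    plan = [KERNELS[(op, dtype, has_vector_unit)] for op, dtype in model]
    return plan, set(plan)

def run_compiled(plan, n_inferences):
    """Execute a fixed plan: no table lookups remain on the inference path."""
    lookups = 0
    for _ in range(n_inferences):
        for _kernel in plan:
            pass  # the pre-selected kernel would be called directly here
    return lookups

interpreted_lookups = run_interpreted(MODEL, True, n_inferences=100)
plan, linked_kernels = compile_ahead_of_time(MODEL, True)
compiled_lookups = run_compiled(plan, n_inferences=100)

print(interpreted_lookups, compiled_lookups)
print(len(linked_kernels), "of", len(KERNELS), "kernels need to be linked")
```

In this toy setup the interpreted path must keep the whole table available because it does not know in advance which entry it will need, while the compiled path can drop the unused kernels. That is the memory and overhead argument made above, reduced to its simplest form.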

[00:09:10]>> So this would allow you to have, for example, a sensor that just transmits this cleaned audio to wherever you need to use it further down the chain.

[00:09:18]Okay. So that's fantastic. It's amazing how it got rid of the baby crying.

[00:09:23]>> Yeah, yeah.

[00:09:24]>> I could have that in real life.

[00:09:25]>> Right, absolutely.

[00:09:26]>> But now let's look at the compression, the newest addition.

[00:09:29]>> Yeah, so this is an interesting one. You know, a lot of customers that we see very much want to do AI on the edge, but they still find that data highly valuable. And you can imagine if you're doing three channels of PPG, ECG signals, IMU data, that's a lot of data that you want to transmit back to the cloud, right? So even though they do local inference, they still want to transmit it. So this is basically doing AI-based compression. It's based on the same idea you see from Google and Meta when they did neural-based codecs for audio.

[00:09:58]But we're now applying it specifically to more physiological signals or wearable kinds of signals. So, ECG, PPG, IMU data.

[00:10:05]And right now it's lossy, but we're getting up to like 16x to 32x compression. So somebody can come in and pick and choose the right compression level they want, what the trade-off is. But the great thing here is, you know, if I go down to like 16x.

[00:10:21]And I can ramp up the noise on the signal as well.

[00:10:28]So we can see that even with 16x compression, you can see the original and you can see the reconstructed. While it's getting something like 17% error, you can see a lot of it is just noise artifacts. It's actually preserving the signal you care about. What it's getting in error is actually noise that you don't want anyway.
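The point that a lossy reconstruction can score a sizable error against a noisy original while still preserving the underlying signal can be shown numerically. The transcript does not say which error metric the demo reports, so this sketch assumes percent RMS difference (PRD), a metric often used for ECG compression. The sine-wave "signal", the noise level, and the idealized codec that returns exactly the clean waveform are all assumptions for illustration.

```python
import math
import random

def prd(reference, reconstructed):
    """Percent RMS difference between a reference and its reconstruction."""
    num = sum((r - x) ** 2 for r, x in zip(reference, reconstructed))
    den = sum(r ** 2 for r in reference)
    return 100.0 * math.sqrt(num / den)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical clean waveform: 5 s of a 1.2 Hz sine sampled at 100 Hz.
clean = [math.sin(2 * math.pi * 1.2 * t / 100) for t in range(500)]

# What the sensor records: the clean waveform plus additive Gaussian noise.
noisy = [c + rng.gauss(0.0, 0.12) for c in clean]

# Idealized lossy codec: keeps the underlying waveform, discards the noise.
reconstructed = list(clean)

error_vs_noisy = prd(noisy, reconstructed)  # measured against the recorded input
error_vs_clean = prd(clean, reconstructed)  # measured against the true signal

print(f"error vs noisy input:  {error_vs_noisy:.1f}%")
print(f"error vs clean signal: {error_vs_clean:.1f}%")
```

Measured against the noisy recording, the reconstruction shows a double-digit percentage error even though it matches the clean waveform exactly. All of that error is the discarded noise, which is the situation described in the demo.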

[00:10:43]>> Yeah. And then what we're showing is that if you capture this for just 5 minutes, the amount of memory you can save on device is 50%, and that's taking into account the AI model itself. So even with the model on there, you're saving about 50% of the memory.

[00:10:54]The bandwidth saved if you transmit over BLE is roughly 95%. So that's a massive reduction in transmission. Usually these devices want to sleep, wake up, capture some data, transmit it, and go back to sleep. This allows them to transmit way less, ultimately saving them power.
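The relationship between a compression ratio and the savings quoted above is simple arithmetic, sketched here. The bandwidth formula follows directly from the ratio. The memory figure additionally depends on how much space the compression model occupies, which the transcript does not state, so the sample rate, sample width, channel count, and model size below are hypothetical values chosen only to show how a figure near 50% can arise for a 5-minute capture.

```python
def bandwidth_saving(ratio):
    """Fraction of transmitted bytes saved by compressing at ratio:1."""
    return 1.0 - 1.0 / ratio

def memory_saving(raw_bytes, ratio, model_bytes):
    """Fraction of on-device memory saved once the model's footprint is counted."""
    used = raw_bytes / ratio + model_bytes
    return 1.0 - used / raw_bytes

# Hypothetical capture: one channel, 250 Hz, 2 bytes per sample, 5 minutes.
SAMPLE_RATE_HZ = 250
BYTES_PER_SAMPLE = 2
CHANNELS = 1
CAPTURE_SECONDS = 5 * 60
raw_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * CAPTURE_SECONDS

# Hypothetical model footprint, picked for illustration, not a measured value.
MODEL_BYTES = 64 * 1024

for ratio in (16, 32):
    print(
        f"{ratio}x: bandwidth saved {bandwidth_saving(ratio):.1%}, "
        f"memory saved {memory_saving(raw_bytes, ratio, MODEL_BYTES):.1%}"
    )
```

Under these assumptions, 16x and 32x compression save 93.75% and 96.875% of the transmitted bytes, which brackets the "roughly 95%" quoted. The memory saving is lower because the model's fixed footprint is amortized over the capture, so it grows as the capture gets longer.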

[00:11:09]>> Okay, fantastic.

[00:11:10]>> Yeah.

[00:11:11]>> So this will be very useful. And it's quite nice, actually, how the waveform is almost identical, minus the... >> Minus the noise, yeah.

[00:11:20]>> Okay.

[00:11:20]>> And so customers can try these demos out. For a lot of them we have a live version where they can click, connect to it, and stream new ECG signals and PPG signals, or they can upload their own files and try it out.

[00:11:31]>> Okay, so you mentioned earlier the AI development playground or zoo.

[00:11:37]So what other means of evaluating Ambiq's technology do you have available for engineers that want to maybe use your services?

[00:11:45]>> Yeah, so today you can kind of try out any one of these demos. We're making them public, so anybody can go and try them out, see how they work.

[00:11:52]We also have a profiling tool, part of neuralSPOT, called AutoDeploy. The great thing with that is you can grab one of our evaluation boards, plug it into your computer, use AutoDeploy, pass it the model, and it's going to evaluate it. It's going to give you a per-layer breakdown. It's going to show you how much power it's consuming, the PMU counters, everything you'd want to know as an engineer.

[00:12:09]>> Okay. So with SPOT, you've not only created a framework to have this extremely low-power silicon, but you've also provided the means to fully utilize that low power, because it's nice to have the framework to use that low power, but if it's not optimized, you end up losing a lot of that, um... >> Exactly. And that's what we found: the open-source tools were fine, but they left a lot on the table. So we want to kind of say, "Hey, we've got the silicon. Let's have the silicon-adjacent tools to go with that," right? In the cloud, you're kind of spoiled with a lot of luxuries and the runtimes you have. At the edge, it's still growing. So we want to give our customers the best kind of experience and make maximum use of all of those SPOT capabilities.

[00:12:46]>> So what else do you have coming out at Ambiq?

[00:12:49]>> Yeah, I'd say the one thing I'm most excited about is our new Atomiq series.

[00:12:53]So today we have Apollo510, the Apollo5 series, which is great and has the MVE acceleration. But we're seeing that for more like computer vision tasks and conversational UI, you're going to need more like ASR kinds of models. So Atomiq is going to be our next-generation silicon.

[00:13:07]And that's going to incorporate an NPU. So it has the Ethos-U85. There's also going to be a variant with a HiFi 5.

[00:13:13]But on top of that, we're also going to have in-package memory, up to 256 megabytes of memory. I think that's a game changer. Today a lot of our customers are using PSRAM, but that is very power consuming. Now they get to do it all on device, and they get to go for much bigger kinds of models.

[00:13:27]>> And so that's going to be new, kind of from the ground up?

[00:13:30]>> It's going to be new ground. It's still going to be M55 based. We're going to have different SKUs of it. And [music] that, I think, is slated for the second half of next year.

[00:13:37]>> Okay, fantastic. Well, thank you very much for running us through these demos [music] and telling us a little bit about what you do at Ambiq, and we're looking forward to this new release that you have coming up.

[00:13:47]>> [music]

#Ambiq #EdgeAI #AIModels #TinyML #UltraLowPower
