LFM2.5-8B-A1B is a mixture-of-experts (MoE) model designed specifically for local agentic workflows. It is presented as state of the art on agentic benchmarks such as tau-bench Telecom, the Berkeley Function Calling eval, and multi-turn instruction following. The model is optimized for CPU inference, reaching hundreds of tokens per second on laptops, and its extended tokenizer handles non-Latin languages better, with a reported 120% improvement in Hindi token efficiency. It has day-zero support in multiple inference engines, including llama.cpp, MLX, and vLLM, and runs fast on AMD, Apple, and Qualcomm devices as well as in the cloud.
Releasing LFM2.5-8B-A1B, a model for local agents
Today we are releasing our LFM2.5 8B mixture-of-experts model. This model is made for one specific purpose: local agentic workflows.
The model delivers state-of-the-art results on many benchmarks that are relevant for agentic use cases, such as the tau-bench Telecom that we have here, the Berkeley Function Calling eval, as well as a multi-turn instruction-following benchmark.
This model is based on the LFM2 mixture-of-experts architecture, which is extremely fast and optimized for CPU inference. So on your laptop you can run this model at hundreds of tokens per second.
What is interesting about this release is that we extended the tokenizer of these models to support non-Latin languages much better. For instance, in Hindi we saw a 120% improvement in the token efficiency of our tokenizer.
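The talk does not say how token efficiency is measured. As a rough illustration only, the sketch below assumes it means characters of text covered per token, so a 120% improvement would mean each token covers 2.2 times as much text on average. The two tokenizers are hypothetical stand-ins, not the LFM2 or LFM2.5 tokenizers, and the helper names are made up for this example.

```python
def chars_per_token(text, tokenize):
    """Assumed metric: characters of text covered per token (higher is better)."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("tokenizer produced no tokens")
    return len(text) / len(tokens)


def percent_improvement(old, new):
    """Relative gain of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Hypothetical stand-in tokenizers, used only to make the sketch runnable.
def coarse_tokenize(text):
    # One token per whitespace-separated word.
    return text.split()


def fine_tokenize(text):
    # One token per non-whitespace code point (a very inefficient tokenizer).
    return [ch for ch in text if not ch.isspace()]


sample = "नमस्ते दुनिया"  # "Hello world" in Hindi
before = chars_per_token(sample, fine_tokenize)
after = chars_per_token(sample, coarse_tokenize)
print(f"improvement: {percent_improvement(before, after):.0f}%")

# Under this definition, going from 1.0 to 2.2 characters per token
# is a 120% improvement.
print(f"{percent_improvement(1.0, 2.2):.0f}%")
```

To compare real tokenizers, the same two helpers could be applied to a larger Hindi corpus with each tokenizer's own encode function passed in as `tokenize`. Counting UTF-8 bytes instead of characters is an equally plausible definition and would give different numbers.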
We have day-zero support for these models in many popular inference engines, including our own LEAP Edge SDK, llama.cpp, MLX, vLLM, SGLang for hosted inference, and many others. We see the model performs top-notch in terms of speed on AMD devices, on Apple devices, and on Qualcomm devices, as well as hosted in the cloud using SGLang as the inference engine. If you have agentic use cases that require high volume at low cost, this model might be for you. We also have an open-source desktop app called LocalCowork where you can check out this model in action. As always, the model is open on Hugging Face, available for everybody.