
Voice AI: when is the "Her" moment? — Neil Zeghidour, Gradium AI

116 views · 12 likes · 19:26 · aiDotEngineer · Original release: 2026-05-09

Voice AI systems face fundamental architectural limitations. Cascaded systems (speech-to-text, LLM, text-to-speech) suffer from high latency: 500 ms to 4 s for tool calls, versus 200 ms for human conversation. Speech-to-speech models are still half-duplex and cannot handle natural backchanneling (overlapping speech), so they feel robotic even though they sound natural. The "Her" moment remains elusive because current models lack paralinguistic understanding (tone, hesitation, emotional cues) and practical utility, and cost is a major barrier to scaling voice applications.
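
The latency argument is simple addition: in a cascaded pipeline each stage waits for the previous one, so the delays add up. The Python sketch below illustrates this under loudly stated assumptions. The per-stage latencies are hypothetical values picked only so that the total lands on the 500 ms lower bound quoted in the summary; the talk summary does not give a per-stage breakdown. The 200 ms target is the human-conversation figure from the summary, and the names `Stage`, `total_latency_ms`, and `over_budget_ms` are invented for illustration.

```python
from dataclasses import dataclass

# Human conversational turn gap quoted in the summary.
HUMAN_TURN_GAP_MS = 200


@dataclass(frozen=True)
class Stage:
    name: str
    latency_ms: int  # hypothetical value, not a measurement


def total_latency_ms(stages):
    """Stages run one after another, so their latencies simply add."""
    return sum(stage.latency_ms for stage in stages)


def over_budget_ms(stages, budget_ms=HUMAN_TURN_GAP_MS):
    """How far the pipeline overshoots the conversational budget (0 if it fits)."""
    return max(0, total_latency_ms(stages) - budget_ms)


# Assumed breakdown for illustration only; it sums to the 500 ms lower bound.
CASCADE = [
    Stage("speech-to-text", 150),
    Stage("LLM", 250),
    Stage("text-to-speech", 100),
]

if __name__ == "__main__":
    print(f"cascade total: {total_latency_ms(CASCADE)} ms")
    print(f"over the {HUMAN_TURN_GAP_MS} ms budget by: {over_budget_ms(CASCADE)} ms")
```

Even with these optimistic assumed numbers, the cascade overshoots the conversational budget, which is the structural point the summary makes: trimming any single stage helps, but the sequential design keeps the sum above what feels natural.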
