Nanowhale-100m: Fascinating Implementation of DeepSeek-V4 Architecture
883 views | 58 likes | 6:48 | fahdmirza | Original version: 2026-06-06

Nanowhale-100m is a compact 110-million-parameter language model that implements the DeepSeek-V4 architecture on a single GPU, showing that a complex transformer architecture can be miniaturized while keeping its core design principles. The model uses Multi-head Latent Attention (MLA) for query compression, a mixture-of-experts layer with 4 routed experts plus 1 shared expert, hyper-connections with Sinkhorn routing, and an extra multi-token prediction head. Despite its small size, it needs only about 1 GB of VRAM and runs on commodity hardware, which makes it an excellent educational tool for understanding how large language models are constructed.
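To make the "4 routed experts plus 1 shared expert" layout concrete, here is a minimal pure-Python sketch of such a layer. It is not taken from the Nanowhale code. The toy hidden size, the softmax router, the top-1 selection, and every name (`moe_forward`, `TOP_K`, `DIM`, and so on) are assumptions made for illustration, since the description does not say how many experts fire per token or how gates are computed.

```python
import math
import random

NUM_ROUTED = 4   # from the description: 4 routed experts
TOP_K = 1        # assumption: experts activated per token is not stated
DIM = 8          # assumption: toy hidden size, far smaller than a real model


def make_matrix(rng, rows, cols):
    """Random weight matrix stored as a list of rows."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, router, routed, shared, top_k=TOP_K):
    """Add the always-on shared expert to the gated top-k routed experts."""
    probs = softmax(matvec(router, x))  # one probability per routed expert
    order = sorted(range(len(routed)), key=lambda i: probs[i], reverse=True)
    chosen = order[:top_k]
    norm = sum(probs[i] for i in chosen)  # renormalise over the chosen experts
    out = matvec(shared, x)               # the shared expert sees every token
    for i in chosen:
        gate = probs[i] / norm
        expert_out = matvec(routed[i], x)
        out = [o + gate * y for o, y in zip(out, expert_out)]
    return out, chosen


rng = random.Random(0)
router = make_matrix(rng, NUM_ROUTED, DIM)
routed = [make_matrix(rng, DIM, DIM) for _ in range(NUM_ROUTED)]
shared = make_matrix(rng, DIM, DIM)
token = [rng.gauss(0.0, 1.0) for _ in range(DIM)]

output, chosen = moe_forward(token, router, routed, shared)
print("routed expert(s) used:", chosen)
```

Each expert here is a single linear map for brevity; in a real transformer each would typically be a small feed-forward network. The point of the sketch is the split between one path that always runs and a routed path that touches only a subset of the expert weights per token.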
