A 7B model reached 88.89% on GPQA Diamond without a single gradient step by using the Darwin merge method. Darwin scores each parameter across multiple checkpoints by magnitude and rank, then uses those trust scores to compute a weighted average, so distinctive signals are preserved and redundant parameters contribute little. The technique works only on checkpoints fine-tuned from the same base model, and it cannot invent skills that none of the parent models have.
Deep dive
A 7B model hit 88.89% on GPQA Diamond with zero gradient steps: they averaged four checkpoints
A 7-billion-parameter model just hit 89% on GPQA Diamond, frontier territory, and the team didn't run a single gradient step.
They averaged four checkpoints. Naive averaging usually trashes both models.
So, how do four good checkpoints become one better one without any training?
Look at every parameter across all four checkpoints. Some are big and distinctive. Others are tiny and look the same in every copy, basically dead weight. Darwin scores each one by magnitude and rank, call it a trust score, then averages by trust. It can't invent a skill none of the parents had. If all four are weak at math, the merge is weak at math.
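To make the "score, then average by trust" idea concrete, here is a minimal NumPy sketch. The transcript does not give Darwin's actual scoring formula, so everything here is an assumption for illustration: the function names `trust_scores` and `trust_weighted_merge` are hypothetical, checkpoints are modeled as plain dicts of arrays with identical keys and shapes, and the trust score is simply the normalized rank of each entry's magnitude within its own tensor.

```python
import numpy as np


def trust_scores(tensor: np.ndarray) -> np.ndarray:
    """Hypothetical trust score: normalized magnitude rank within one tensor.

    This is an assumed stand-in, not Darwin's published formula.
    """
    mag = np.abs(tensor).ravel()
    # argsort of argsort yields each entry's rank (0 = smallest magnitude).
    ranks = np.argsort(np.argsort(mag, kind="stable"), kind="stable")
    # Scale to (0, 1] so even the smallest entry keeps a nonzero weight.
    return ((ranks + 1) / mag.size).reshape(tensor.shape)


def trust_weighted_merge(checkpoints):
    """Merge a list of {name: array} dicts that share keys and shapes."""
    merged = {}
    for name in checkpoints[0]:
        # Shape (K, ...): one slice per checkpoint.
        stacked = np.stack([ckpt[name] for ckpt in checkpoints])
        trust = np.stack([trust_scores(ckpt[name]) for ckpt in checkpoints])
        # Normalize trust across checkpoints so weights sum to 1 per parameter.
        weights = trust / trust.sum(axis=0, keepdims=True)
        merged[name] = (weights * stacked).sum(axis=0)
    return merged
```

With two toy checkpoints `{"w": [1.0, 0.0]}` and `{"w": [0.0, 1.0]}`, plain averaging gives 0.5 for each entry, while this sketch gives about 0.67, because the larger entry in each position gets more weight. Each merged value is still a weighted average of the parents' values, so it always lies between the smallest and largest parent value for that parameter.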
So, if you fine-tune the same base model four times for four different jobs, you're already sitting on a bigger model. Hugging Face open-sourced the Darwin family on May 14th. 89 on GPQA Diamond, zero gradient steps.