A 7B model reached 88.89% on GPQA Diamond without any gradient steps by using the Darwin merge method. Darwin scores each parameter across multiple checkpoints by magnitude and rank, then uses those scores as trust weights when averaging, which preserves distinctive signals and discounts redundant parameters. The technique works only on checkpoints from the same base model family and cannot invent skills absent from all parent models.
Deep dive
A 7B model hit 88.89% on GPQA Diamond with zero gradient steps: they averaged four checkpoints
A 7-billion-parameter model just hit 89% on GPQA Diamond, frontier territory, and the team didn't run a single gradient step.
They averaged four checkpoints. Naive averaging usually trashes the models it combines.
So, how do four good checkpoints become one better one without any training?
Look at every parameter across all four checkpoints. Some are big and distinctive. Others are tiny and look the same in every copy, basically dead weight. Darwin scores each one by magnitude and rank, call it a trust score, then averages by trust. It can't invent a skill none of the parents had. If all four are weak at math, the merge is weak at math.
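To make "score by magnitude and rank, then average by trust" concrete, here is a minimal sketch in plain Python. It is not Darwin's actual formula, which the video does not spell out. The trust score used here (absolute value times the parameter's magnitude rank within its own checkpoint) and the function names `rank_fractions`, `trust_scores`, and `trust_weighted_merge` are assumptions made for illustration only.

```python
from typing import List


def rank_fractions(values: List[float]) -> List[float]:
    """Rank each value by magnitude within one checkpoint, scaled to (0, 1].

    The largest magnitude gets 1.0. Ties are broken by position, which is a
    simplification chosen for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: abs(values[i]))
    ranks = [0.0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position / len(values)
    return ranks


def trust_scores(values: List[float]) -> List[float]:
    """Hypothetical trust score: magnitude multiplied by rank fraction."""
    ranks = rank_fractions(values)
    return [abs(v) * r for v, r in zip(values, ranks)]


def trust_weighted_merge(checkpoints: List[List[float]]) -> List[float]:
    """Merge flattened checkpoints by a per-parameter trust-weighted average."""
    size = len(checkpoints[0])
    if any(len(c) != size for c in checkpoints):
        raise ValueError("all checkpoints must have the same number of parameters")

    scores = [trust_scores(c) for c in checkpoints]
    merged = []
    for i in range(size):
        total = sum(s[i] for s in scores)
        if total == 0.0:
            # No checkpoint has any signal here: fall back to the plain mean.
            merged.append(sum(c[i] for c in checkpoints) / len(checkpoints))
        else:
            weighted = sum(s[i] * c[i] for s, c in zip(scores, checkpoints))
            merged.append(weighted / total)
    return merged


if __name__ == "__main__":
    specialist = [10.0, 0.1]  # one large, distinctive parameter
    generalist = [0.1, 0.1]   # small values everywhere
    print(trust_weighted_merge([specialist, generalist]))
```

In this toy case the first merged parameter stays close to the specialist's 10.0, where a plain mean would give 5.05. The sketch also shows the limit mentioned above: each merged value is a weighted average of the parents' values, so it can never fall outside the range the parents already cover.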
So, if you fine-tune the same base model four times for four different jobs, you're already sitting on a bigger model. Hugging Face open-sourced the Darwin family on May 14th. 89 on GPQA Diamond, zero gradient steps.