Hermes Agent masterfully disrupts the AI ecosystem by turning consumer subscriptions into high-performance local APIs, effectively bypassing the "API tax" for developers. It is a pragmatic, high-speed bridge that prioritizes user sovereignty over the restrictive walled gardens of major AI providers.
Deep Dive
Prerequisite Knowledge
- No data available.
Where to go next
- No data available.
Deep Dive
Hermes Foundational Release: One Subscription, Native Windows, 180x Faster BrowserAdded:
Hermes agent has been picking up serious momentum over the past couple of months and the numbers are kind of hard to ignore. The repo is sitting around 157,000 stars on GitHub now and most of that growth happened in just a few weeks.
People are calling it the fastest growing open source agent project of 2026 and there's a reason for that. It's not another coding tool that lives inside your editor. It runs as a persistent background agent on your own machine, talks to you through whatever messaging app you already use and quietly gets smarter the longer you keep it around. And the new release that just dropped is being called the foundation release by Nous Research. So, I want to walk through what actually changed, why people care about this project in the first place and how it stacks up against the other open source option a lot of folks are comparing it to right now, which is Open Claw.
Quick context before we get into the update itself. Hermes is built by Nous Research, MIT licensed, completely free to self-host.
The pitch is simple. Instead of starting a fresh chat every time and explaining your setup all over again, the agent remembers. It writes its own skill files when it learns how to do something new.
It runs cron jobs. It can fire off scheduled tasks while you sleep and it works with basically any model provider, which is the part that makes it different from Claude code or Codex CLI.
You can route the same agent through Claude, GPT-5, Grok, DeepSeek, Quinn, local Ollama, whatever you have access to.
That flexibility is a huge part of why adoption has been climbing so fast. Now into the actual changes. The headline feature for me is the new local proxy.
You run Hermes proxy in your terminal and Hermes spins up a local endpoint that speaks the OpenAI API format. But here's the trick. It's not backed by an OpenAI API key. It's backed by whichever subscription you're already paying for.
Things like Claude Pro, ChatGPT Pro or SuperGrok through the OAuth login you already did.
So, any other tool that expects an OpenAI compatible endpoint, like Codex CLI, Aider, Cline, Continue, your own scripts, can just point at that local URL and start working.
One subscription, every coding tool you use, no extra API key, no extra billing.
That's a quietly huge cost story for indie devs and small teams who don't want to juggle four different bills for the same model.
The XAI integration is the other piece getting a lot of attention.
SuperGrok now works as an OAuth provider inside Hermes, and Grok 4.3 just bumped to a 1 million token context window through this path. Sign in with your XAI account, no API key, no separate billing system. You also get the full Grok stack, text chat, text-to-speech, image generation, and X search, which is now a first-class tool inside Hermes for searching X directly from a prompt. That last one matters more than it sounds.
If you're running an autonomous research agent that needs to track what's happening on Twitter in real time, you don't need to wire up a third-party scraper anymore. The agent just searches X natively with your OAuth login or an API key, whichever you have. For folks who want to actually install this, the new version is also a real PyPI package now. So, instead of cloning the repo or running a shell installer, you just run pip install Hermes agent, and then type Hermes. That's it. To update an existing install, Hermes update fetches the latest release. And on Windows, this version is the first one with proper native support. No more WSL workarounds.
Hermes runs directly on cmd.exe and PowerShell now with a real PowerShell installer that handles MinGW, the Microsoft Store Python stub thing, and the foreground control-C dance. It's still flagged as early beta because around 40 Windows specific fixes already landed inside the same release window.
So, expect rough edges, but the basic loop works end to end on a clean Windows machine.
Performance is where this release puts in real work. Cold start dropped by about 19 seconds on average. The way they got there is mostly deferred loading. Heavy adapters only load when you actually use them. Model catalogs hit a local disk cache before reaching out to the network. Doctor checks now run in parallel.
There's a specific stat in the change log that sticks out to me. The Hermes tools all platform screen went from 14 seconds to under 1.5 seconds. Browser automation got an even bigger jump.
The browser console tool now shares one persistent Chrome DevTools protocol connection instead of opening a fresh session for every single call. And the result is up to 180 times faster on those evaluations. If you've ever watched an agent waste 30 seconds spinning up a browser just to click one button, you know why this matters.
There's also a new cross session one hour prompt cache for Claude when you go through Anthropic, open router, or news portal.
What this means in plain terms, the system prompt, your skills, your memory, all of that stays warm for an hour.
Start a new session right after closing one, and the first response comes back faster and noticeably cheaper because it's hitting the cache. Background memory review hits the same cache, so the agent isn't paying full price every turn just to think about what it learned last week. The handoff command got a real upgrade, too. {slash} handoff used to be a soft transfer.
Now it actually moves your live session, every message, every tool call, every bit of context to a different model, persona, or profile without dropping anything. So, you can start a task on a fast, cheap model. Then mid-debug pass it to a deeper reasoning model. Then maybe hand it to a separate persona profile for refinement. No restarts, no copy-paste, no losing track of where you were.
For long autonomous workflows that span hours or days, this is the kind of feature you don't appreciate until you've watched a 6-hour session reset because you wanted to switch models.
Video generation now has a unified video Jira tool with pluggable provider backends, which means new video models can be added as one-file plugins instead of requiring a fork. Vision analysis got smarter when the active model can actually see, like GPT-5 or Claude or Gemini or Grok vision. The agent passes raw pixels straight through instead of converting to a text summary first. So you get actual visual reasoning from the model, not a degraded text round trip. A few other things worth calling out. LSP semantic diagnostics now run on every file right. So when the agent edits code, a real language server runs against the file and surfaces type errors, undefined symbols, and missing imports back to the agent before its next turn. That's a step beyond the basic linting from version 0.13.0 because it's actual semantic analysis from the same language servers your IDE uses.
Per-turn file mutation verifier is another small but meaningful one. After every turn that touches files, the agent gets a short footer showing exactly what changed on disk. So it catches its own mistakes when a write didn't actually land instead of confidently telling you it added a function when the file is unchanged.
Discord got history backfill turned on by default, which means when Hermes joins a channel for the first time, it reads recent messages before it responds. So no more "What are we talking about?" when you add the bot to a thread that's already mid-conversation.
Telegram and Discord also got native button UI for the clarify tool. So, when the agent asks you a multiple-choice question, you tap a button instead of typing the option number back.
Two new messaging platforms also landed.
Line, which is huge in Japan, Korea, and Taiwan. And SimpleX Chat, which is the privacy-focused, decentralized one with no user IDs. That brings Hermes to 22 messaging platforms total.
Now, about the Open Claw comparison, since this is the thing a lot of people are weighing right now. Open Claw has the bigger ecosystem, around 350,000 GitHub stars, a marketplace with thousands of community skills, and an enterprise fork from Nvidia called Nemo Claw.
So, on raw community size, it's still ahead. But, the security story has been rough. Open Claw took nine CVEs in 4 days back in March, and a supply chain audit of the Claw Hub marketplace found that roughly 12% of skills in the initial scan were malicious.
So, if you're running this stuff on a machine that has access to your code, your accounts, or your money, that matters.
Hermes shipped a supply chain advisory checker in this release that scans every install for unsafe versions, plus pseudo brute-force blocking, closed three known dangerous command bypasses, and now sanitizes tool error strings before they go back into model context, so a malicious file can't smuggle instructions to the agent through error output. The other real difference is what they optimize for. Open Claw is built around static skills you install and curate yourself. Hermes writes its own skill modules based on what it runs into. Memory in Open Claw is file-based.
You can open each memory entry and edit it.
Memory in Hermes is SQLite backed with full-text search and an actual user model built up over time, which is less transparent, but compounds more. There's even a Hermes Claw migrate command shipped to pull people directly from open claw, which tells you how seriously news is going after that audience.
I'll keep covering the smaller features as I get hands-on with them because the change log for this release runs deep.
808 commits, 633 merged pull requests, 215 community contributors. Links to everything I mentioned are below. Thanks for watching.
Related Videos
OpenHuman VS Hermes AI: Who Wins?
JulianGoldieSEO
285 views•2026-05-29
BREAKING: Microsoft’s New Image Generating Model Beat Out GPT 1.5 and Nano Banana 2
aimmediahouse
122 views•2026-06-03
Long-Running Agents — Build an Agent That Never Forgets with Google ADK
suryakunju
142 views•2026-05-30
This computer is made from real human brain cells. And you can buy it.
Talktmsmedia
3K views•2026-05-28
I Made the Same Anime Fight Scene in Every AI Video Generator
NobleGooseAnime
295 views•2026-05-30
Nvidia Bets Big On AI PCs | New Chip To Power Windows Laptops | Technology | AI Updates | N18S
cnnnews18
3K views•2026-06-01
I Tested NEW Opus 4.8 on Four Projects (Updated LLM Leaderboard)
AICodingDaily
298 views•2026-05-29
3D Platformer Update - NO CAPES
SolarLune
294 views•2026-05-30











