Hermes Agent vs OpenClaw — Data-Driven Depth vs Community Breadth

Hermes Agent for solo devs doing research and ML work; OpenClaw for teams that need 30+ messaging integrations and community-tested ecosystem depth.

- OpenClaw: 339K stars, 13,700+ skills, 30+ messaging channels, 2M MAU — the most community-tested agent runtime alive
- Hermes Agent: 18.8K stars in six weeks, six terminal backends, FTS5 persistent memory, and a data flywheel that trains Nous Research’s own models
- OpenClaw had 512 vulnerabilities found in a January 2026 audit, 8 critical, 42,900 exposed instances — documented risk with active remediation
- Hermes ships hermes claw migrate to import OpenClaw settings, memories, and skills — switching cost is deliberately low
- Solo dev doing research or ML work: Hermes. Team needing messaging breadth and battle-tested integrations: OpenClaw
OpenClaw’s 339K stars represent five months of production exposure, real security audits, and 2 million users who have already found the sharp edges. Hermes Agent’s 18.8K stars represent six weeks of explosive growth and a structural incentive that no other open-source agent has: the trajectories your agent runs can feed training data back to Nous Research’s own models. That is either the most honest alignment between tool and maker in the space, or a data collection pipeline wearing developer tooling as a disguise. The answer matters for your threat model, so let us work through it properly.
I want to be direct about where I land before we get into the details: OpenClaw is the safer default for teams, not because it is better software, but because 2 million users have already hit its failure modes and most of those failures are now documented. Hermes is the more interesting bet for solo developers and ML-adjacent work, precisely because its incentive structure creates a feedback loop that OpenClaw cannot replicate. Those are different things, and conflating them will lead you to the wrong tool.
What You Are Actually Comparing
These two tools look identical on a feature checklist. Both are self-hosted, model-agnostic, autonomous agents with messaging gateways, skill systems, and persistent memory. The surface similarity is real. The architecture underneath is not.
Hermes Agent is a Python 3.11+ agent runtime built by Nous Research. It launched internally in July 2025 and went public with enough momentum to hit 18.8K stars by March 23, 2026. It ships with six terminal execution backends — local, Docker, SSH, Daytona, Singularity, and Modal — which means it was designed from the start for isolated, reproducible compute environments. The research lineage shows throughout. Hermes has FTS5-backed full-text-search persistent memory with LLM summarization, agent-curated skills that self-generate after complex tasks, and a trajectory export system that integrates with Atropos for reinforcement learning experiments. v0.5.0 shipped March 28 with the subtitle “The Hardening Release” — which tells you something about where the team’s attention is right now. v0.6.0 followed two days later on March 30, adding Profiles for running multiple isolated instances with token locks and expanding the messaging platform list.
OpenClaw is a Node.js agent platform with a single-process gateway architecture and a plugin SDK that has attracted 13,729 skills on ClawHub. It supports 30+ messaging channels out of the box — WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Teams, Matrix, IRC, LINE, and more. The OpenClaw Foundation is a registered 501(c)(3). The ex-founder now works at OpenAI. Two million monthly active users. Those numbers represent something concrete: organizational infrastructure, long-term maintenance commitments, and a community large enough to find bugs in production before they find you. They also represent a scale at which security incidents become serious — more on that shortly.
Runtime Architecture and Isolation
OpenClaw runs in a single Node.js process. Its gateway model is straightforward and easy to reason about. Runtime isolation between skills depends on plugin discipline, not architecture — the security boundary is a social contract enforced by coding conventions, not a kernel boundary enforced by the operating system.
Hermes gives you six execution backends precisely because isolation is a first-class concern. If you want agent code running in a Docker container or a remote SSH session or a cloud-hosted Daytona environment rather than your local shell, that is a configuration option in Hermes. In OpenClaw, that is a feature request or a custom integration you build yourself.
For agents running untrusted or generated code — which is most agents doing anything interesting — this is not a minor implementation detail. It is the difference between an architectural guarantee and a social contract. When your agent is browsing the web, calling external APIs, and executing generated shell commands, you want the execution environment to enforce limits, not just hope that skills behave.
The tradeoff is operational complexity. Six backends means six configurations to understand, six potential failure modes, and a steeper initial setup for developers who just want an agent that works. OpenClaw’s single-process model is genuinely simpler to operate and debug. Whether that simplicity is worth the isolation tradeoff depends entirely on what your agent is doing and who is running it.
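To make the architectural point concrete, here is a minimal sketch of backend dispatch in Python. The helper names (run_local, run_in_docker, execute) are our own illustration, not Hermes's actual API; the point is that isolation becomes a per-call configuration choice rather than a property of one shared process.

```python
import subprocess
from dataclasses import dataclass

# Hypothetical sketch: these names are illustrative, not Hermes internals.

@dataclass
class ExecResult:
    returncode: int
    stdout: str

def run_local(cmd: list[str]) -> ExecResult:
    # No isolation: the command runs in the host environment.
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return ExecResult(proc.returncode, proc.stdout)

def run_in_docker(cmd: list[str], image: str = "python:3.11-slim") -> ExecResult:
    # Container isolation: filesystem and network limits are enforced by
    # the kernel, not by the agent's good behavior.
    docker_cmd = ["docker", "run", "--rm", "--network", "none", image, *cmd]
    proc = subprocess.run(docker_cmd, capture_output=True, text=True)
    return ExecResult(proc.returncode, proc.stdout)

BACKENDS = {"local": run_local, "docker": run_in_docker}

def execute(cmd: list[str], backend: str = "docker") -> ExecResult:
    # Isolation level is a dispatch decision made per call.
    return BACKENDS[backend](cmd)
```

In a single-process design, the equivalent of `run_in_docker` is not a config option; it is an integration you write and maintain yourself.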
Memory and Skills: Breadth vs Depth
OpenClaw’s memory is Markdown-based. Readable, portable, and simple to inspect or modify by hand. ClawHub has 13,729 community skills — breadth that took years to accumulate across a large developer community. That breadth is real value. It also comes with a real tax: the January 2026 ClawHavoc security operation found 341 malicious skills in that catalog. When your skill ecosystem is this large and community-sourced, vetting what you install is not optional — it is operational hygiene.
Hermes uses FTS5-backed persistent memory with LLM summarization. The search capability matters here: when your agent has months of accumulated memory, full-text search with relevance ranking is the difference between usable recall and noise. But the more structurally interesting feature is the self-generating skills. After completing complex tasks, Hermes agents generate and curate their own skills from what they learned. Instead of installing skills written by strangers, the agent writes its own from experience.
This is a fundamentally different philosophy about how agent capability accumulates. OpenClaw’s approach optimizes for immediate breadth — install a skill, get the capability. Hermes’s approach optimizes for compounding depth — the agent becomes more capable at the specific things you ask it to do, without accumulating capabilities you never use. For a solo developer running one agent for an extended period, the depth compounds in genuinely useful ways. For a team deploying agents across many different use cases, OpenClaw’s pre-built skill library remains the faster path to coverage.
Neither approach is wrong. They reflect different assumptions about who the primary user is and what they need the agent to do.
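The FTS5 side of Hermes's memory model can be sketched in a few lines, assuming a SQLite build with the FTS5 extension compiled in (true of most stock CPython binaries). The schema and function names here are illustrative, not Hermes's actual internals:

```python
import sqlite3

# Illustrative sketch of FTS5-backed memory recall; schema is our own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE memory USING fts5(ts, summary)")

def remember(ts: str, summary: str) -> None:
    conn.execute("INSERT INTO memory VALUES (?, ?)", (ts, summary))

def recall(query: str, limit: int = 3) -> list[str]:
    # bm25() gives relevance ranking, so the best match comes first even
    # when months of memory match the query loosely.
    rows = conn.execute(
        "SELECT summary FROM memory WHERE memory MATCH ? "
        "ORDER BY bm25(memory) LIMIT ?",
        (query, limit),
    )
    return [r[0] for r in rows]

remember("2026-03-01", "Debugged flaky Docker backend timeouts on CI")
remember("2026-03-04", "Summarized RL trajectory export format for Atropos")
remember("2026-03-09", "Fixed SSH backend auth by rotating deploy keys")

print(recall("docker"))
```

The ranking is what keeps months of accumulated memory usable: the closest match surfaces first instead of drowning in loose hits, which is the practical difference from grepping a folder of Markdown files.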
Model Routing and Provider Flexibility
Both tools are model-agnostic, and neither locks you to a single provider. OpenClaw routes through any provider via its gateway architecture. Hermes routes through Nous Portal (400+ models), OpenRouter, OpenAI, Anthropic, HuggingFace, GitHub Copilot, and custom endpoints; HuggingFace graduated to a first-class provider in v0.5.0.
In practice, the difference here is smaller than the feature lists suggest. If you are running local models via Ollama or similar tooling, the Python stack in Hermes plays more naturally with the ML ecosystem than Node.js does — you are already in Python, your model management tooling is Python, and the friction of context-switching between runtimes is real. That is a practical advantage for ML-adjacent development, not just a theoretical compatibility point.
The Nous Portal integration is worth noting separately: 400+ models through a single endpoint means you can switch between models for different tasks without reconfiguring your entire agent setup. Whether that convenience justifies routing through Nous Research’s infrastructure rather than directly to providers is a question your threat model should answer.
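In practice, provider-agnostic routing mostly reduces to a table of OpenAI-compatible endpoints. The routing table below is a hypothetical sketch: the URLs and model names are illustrative and should be checked against each provider's current documentation before use.

```python
# Hypothetical per-task model routing over OpenAI-compatible endpoints.
# Base URLs and model names are illustrative examples, not a vetted list.

ROUTES = {
    # task -> (base_url, model). Most providers expose an OpenAI-compatible
    # chat completions API, which is what makes single-config switching work.
    "code":  ("https://openrouter.ai/api/v1", "some-code-model"),
    "chat":  ("https://api.openai.com/v1", "gpt-4o-mini"),
    "local": ("http://localhost:11434/v1", "llama3.1"),  # Ollama's OpenAI shim
}

def endpoint_for(task: str) -> tuple[str, str]:
    # Unknown task types fall back to the general-purpose chat route.
    return ROUTES.get(task, ROUTES["chat"])
```

Switching a task from a hosted model to a local Ollama instance is then a one-line change to the table, which is the convenience both tools are selling.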
Messaging Gateway: Where OpenClaw Wins Clearly
OpenClaw has 30+ messaging integrations built over years — WhatsApp, Telegram, Slack, Discord, Signal, iMessage, Teams, Matrix, IRC, LINE, and more including legacy channels that enterprise and consumer deployments still depend on.
Hermes currently supports 13 platforms: Telegram, Discord, Slack, WhatsApp, Signal, SMS (via Twilio), Email, Home Assistant, Mattermost, Matrix, DingTalk, Feishu/Lark, and WeCom. v0.4.0 on March 23 added Signal, DingTalk, SMS, Mattermost, and Matrix. v0.6.0 on March 30 added Feishu/Lark and WeCom. They are moving at a pace that is genuinely impressive for a six-week-old public project.
But pace does not close a multi-year head start quickly. If your team communicates across iMessage, Teams, IRC, and LINE — or if your users expect channel variety across consumer and enterprise platforms — OpenClaw is the practical answer and the comparison is not close. Hermes covers the channels most developer teams actually use day-to-day. It does not yet cover the long tail that enterprise deployments require or the legacy channels that established communities depend on.
Security Posture: Honest Treatment of Both
OpenClaw’s security record requires direct engagement, not minimization. A Kaspersky audit in late January 2026 identified 512 vulnerabilities, eight classified as critical. Researchers found 42,900 exposed instances. Authentication was off by default. API keys were stored in plain text. The ClawHavoc operation identified 341 malicious skills in ClawHub. In documented worst-case scenarios, researchers found instances where they could send messages and execute commands with full system administrator privileges, and thousands of Anthropic API keys, Telegram bot tokens, Slack accounts, and months of private chat histories were accessible to anyone who looked.
That is a serious security record. But it is also a documented one: those vulnerabilities were found, published, and are now being addressed in public. Hermes has no comparable record yet. The v0.5.0 “Hardening Release” naming is itself revealing — it implies there was something that needed hardening, and the project is young enough that a coordinated external audit has not happened.
Fewer known vulnerabilities in Hermes does not mean fewer vulnerabilities. It means fewer people have looked. Six weeks of public exposure versus five months, 18.8K stars versus 339K stars, a small research lab versus a 501(c)(3) with an industry-wide user base — the audit surfaces are not comparable. Hermes’s publicly known vulnerability count is currently zero. That number will change once it attracts the scrutiny OpenClaw has already absorbed.
The practical takeaway: if you deploy either of these agents facing the public internet without deliberate security hardening, you are taking on risk. OpenClaw’s risk is documented and partially quantified. Hermes’s risk is real but currently unmeasured. For security-critical deployments, neither is production-safe out of the box. OpenClaw at least tells you exactly what you are dealing with.
Minimum hardening steps you should take with either tool:
For OpenClaw: enable authentication at the gateway level before exposing any endpoint, rotate all API keys that have ever touched an instance connected to a network, audit your installed ClawHub skills against known malicious patterns from the ClawHavoc disclosure, and restrict instance access to specific IP ranges if possible.
For Hermes: review trajectory export settings before running any agent against sensitive data or internal systems, audit which execution backends are enabled and restrict to the minimum required for your use case, and treat the project’s current hardening phase as a signal to follow release notes closely — the team is actively finding and fixing things.
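Most of those checklist items are mechanically checkable. As an illustration, here is a small Python audit over a hypothetical config dict; the key names are ours, not either tool's actual schema.

```python
# Hypothetical pre-deployment audit. Config keys are illustrative only.

def audit(config: dict) -> list[str]:
    """Return a list of findings; an empty list means the checks passed."""
    findings = []
    if not config.get("gateway_auth_enabled", False):
        findings.append("gateway authentication is disabled")
    if config.get("trajectory_export_enabled", False):
        findings.append("trajectory export is on; verify what it captures")
    # Restrict execution backends to the minimum required (here, Docker only).
    extra = set(config.get("exec_backends", [])) - {"docker"}
    if extra:
        findings.append(f"non-Docker backends enabled: {sorted(extra)}")
    if not config.get("allowed_ip_ranges"):
        findings.append("no IP allowlist; instance may be internet-reachable")
    return findings
```

Wiring a check like this into CI turns the hardening advice above from a blog paragraph into a deployment gate.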
The Atropos Flywheel: Honest Analysis
This is the part of the comparison that has no equivalent in OpenClaw, and it deserves direct treatment rather than a footnote.
Hermes Agent integrates with Atropos, Nous Research’s reinforcement learning training framework. Hermes is explicitly described as a platform for generating training data, running RL experiments, and exporting trajectories for fine-tuning. When you run agents with trajectory export enabled, the interaction data — the sequence of actions, tool calls, and outcomes — can be exported as training data for fine-tuning language models. The integration supports batch trajectory generation with automatic checkpointing.
Is this the most honest incentive alignment in open-source AI, or a data pipeline dressed as a developer tool?
My read: it is closer to genuine incentive alignment, with a caveat. The export is opt-in — you control whether your agent’s trajectories become training data. What Nous Research gets from wide adoption of Hermes is a diverse source of high-quality agent interaction data for training more capable models, which makes Hermes itself more capable over time. The flywheel is real and it runs in your direction, not against it. An OpenClaw skill ecosystem grows because developers manually upload things. Hermes’s training flywheel grows because agents actually do things — the feedback loop is tighter and the signal quality is higher.
The caveat is important: “opt-in” only protects you if you read the default configuration. Before running Hermes in any context where your agent handles sensitive data — internal documents, customer information, proprietary code, API keys — verify exactly what trajectory export captures, whether it is enabled by default in your version, and what Nous Research’s data retention and usage policies actually say. This is not paranoia. This is standard due diligence for any tool that can transmit data from your environment to a third party.
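One concrete piece of that due diligence is scrubbing trajectories before anything leaves your environment. A hypothetical redaction pass might look like the sketch below; the record shape and the patterns are illustrative, not the actual Atropos export format.

```python
import re

# Hypothetical trajectory scrubbing before export. Record shape is ours.

SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),    # API-key-shaped tokens
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-shaped numbers
]

def redact(text: str) -> str:
    # Replace anything secret-shaped before the text leaves the machine.
    for pat in SECRET_PATTERNS:
        text = pat.sub("[REDACTED]", text)
    return text

def scrub_trajectory(steps: list[dict]) -> list[dict]:
    # Redact every string field in each action/observation record.
    return [
        {k: redact(v) if isinstance(v, str) else v for k, v in step.items()}
        for step in steps
    ]
```

The principle generalizes: anything that can transmit data to a third party should pass through a redaction step you control, regardless of what the tool's own defaults claim.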
If you are doing pure research work or running public-facing agents on non-sensitive tasks, the Atropos integration is a genuine differentiator. Your agent’s experience trains better models, which in turn power better agents — a compounding advantage that OpenClaw’s architecture cannot replicate.
The Migration Path Is a Strategic Move
Hermes ships hermes claw migrate — a command that imports OpenClaw settings, memories, skills, and API keys automatically. The switching cost from OpenClaw to Hermes is deliberately low, and Nous Research built that intentionally. The goal is to give OpenClaw’s 2 million users a frictionless path to try Hermes without committing to it.
That is a smart competitive move. It is also worth noting that the converse is not necessarily true — there is no documented openclaw hermes migrate command. Before moving production workloads, verify that your migration out is as smooth as your migration in. Easy import with difficult export is a lock-in pattern that has burned developers in other ecosystems, and you should check the current state of bi-directional migration before depending on it.
The practical upside: if you are an existing OpenClaw user, evaluation is nearly zero-cost. Run hermes claw migrate, spend a week with Hermes, and decide. That is a genuinely good evaluation path and the migration command’s existence is worth real credit.
Comparison Table
| Dimension | Hermes Agent | OpenClaw | Note |
|---|---|---|---|
| Architecture | Docker/SSH/Modal isolation backends | Single-process Node.js | Hermes has architectural isolation; OpenClaw relies on plugin discipline |
| Ecosystem | Self-generating skills, FTS5 memory | 13,729 ClawHub skills, Markdown memory | OpenClaw has breadth; Hermes compounds depth with use |
| Developer UX | Python 3.11+, six terminal backends | Node.js, plugin SDK | Hermes favors ML-adjacent devs; OpenClaw favors JS ecosystem |
| Compatibility | 13 messaging platforms, hermes claw migrate | 30+ messaging channels, plugin standard | OpenClaw has clear channel advantage; Hermes closing gap fast |
| Maturity | v0.6.0, 6 weeks public, 18.8K stars | v1.x, 5 months public, 339K stars | OpenClaw more battle-tested; Hermes shipping at high velocity |
| Use-Case Fit | Research, RL experiments, isolated compute | Team messaging, broad channel coverage | Different primary missions despite surface similarity |
Use-Case Matrix
| Scenario | Recommendation | Reasoning |
|---|---|---|
| Solo dev, research or ML work | Hermes Agent | Python ecosystem, isolated compute backends, FTS5 memory compounds with experimentation |
| Team of 5+, multiple messaging channels | OpenClaw | 30+ channel integrations, organizational infrastructure, battle-tested at scale |
| Budget-conscious, local models via Ollama | Hermes Agent | Python ML tooling compatibility, HuggingFace first-class in v0.5.0 |
| Security-critical enterprise deployment | Neither without hardening | OpenClaw has documented vulnerabilities; Hermes has undiscovered ones |
| Existing OpenClaw user wanting to evaluate | Hermes Agent | hermes claw migrate makes evaluation nearly zero-cost |
| Broad consumer channel coverage (iMessage, LINE, IRC) | OpenClaw | Hermes has 13 platforms; OpenClaw has 30+ including legacy channels |
| RL research or training data generation | Hermes Agent | Atropos integration is unique; no equivalent exists in OpenClaw |
| Team needing 501(c)(3)-backed long-term support | OpenClaw | Organizational guarantees a six-week-old project cannot match |
Conclusion
Use Hermes Agent if:
- You are a solo developer or small team doing research, RL experiments, or ML-adjacent work
- You want persistent memory that compounds with use rather than a catalog of skills installed from strangers
- You are already running Python tooling and want your agent runtime in the same ecosystem
- You care about architectural isolation for agent code execution — Docker and SSH backends are config options, not feature requests
- You want to evaluate before committing — the migration path from OpenClaw is deliberately frictionless
- You are doing work where the Atropos training flywheel is a feature rather than a concern
Use OpenClaw if:
- You need messaging integrations across 30+ channels, including iMessage, Teams, IRC, and LINE
- You have a team that needs the organizational guarantees of a 501(c)(3)-backed project with long-term maintenance commitments
- You want an ecosystem of 13,700+ community skills even if that means doing your own vetting
- Your users or team members are already in channels that OpenClaw supports and Hermes does not
- You need battle-tested documentation and a community large enough to have already answered your question
One thing both tools share: neither is production-safe without intentional security hardening. OpenClaw’s vulnerabilities are documented and being addressed. Hermes’s vulnerabilities have not been found yet — but they exist, and the project is young enough that the audit that reveals them has not happened. Plan for both.
The star count difference — 339K versus 18.8K — tells you about community size, not quality. What that size actually buys is exposure: OpenClaw has been tested at a scale Hermes has not reached yet. That is either reassuring or concerning depending on which tool you are evaluating.