AI Agents Are Evolving from Toys to Tools: Who Will Define the Next Generation Interface Standard?

Two months ago, AI agents had predictable limitations:

  • They could write scripts but would forget them once finished
  • Asking them to handle a complex task would elicit "I need more context"
  • Every conversation felt like starting from scratch, requiring you to re-explain your needs

But the landscape has shifted.

Hermes Agent has exploded on GitHub, reaching 154K stars, and supports 24/7 autonomous task execution. Its three-layer memory system lets it evolve its own skills over time. OpenAI's Codex can import an entire codebase and fix in 30 minutes a bug that might take a human 2.5 hours. Anthropic has released 10 pre-built agents for the financial services sector, tackling revenue-linked tasks from business plans to credit memos.

AI agents are no longer a future concept; they are the subject of a current power struggle.

Open-source projects, Big Tech, and startups are all battling for dominance across three key domains: Memory Modules, Multi-Agent Collaboration, and Enterprise Workflows.

Whoever controls these areas will define the standard for next-generation AI interaction.


The Explosive Growth of the Open-Source Camp

On May 14, Nous Research tweeted:

Hermes Agent has reached #1 in token usage on OpenRouter.

The accompanying image showed Hermes's GitHub page with 154K stars, over 7K likes, and more than 2.9M views.

<!-- IMAGE: articles/images/hermes-github-stats.jpg -->

This isn't just open-source hype. A top position on OpenRouter's token usage charts indicates that developers are using it intensively in real-world use cases, a result that reflects practical utility.

The success of Hermes Agent boils down to three main factors:

1. Three-Layer Memory Architecture

Short-term cache + persistent storage + self-evolving skill library. In simple terms:

  • It remembers the immediate conversation.
  • It remembers what you discussed last week.
  • It saves newly learned skills for direct use next time.

Two months ago, an agent's memory would be wiped after a conversation. Today's Hermes can apply what you teach it in future interactions.
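
To make the three layers concrete, here is a minimal sketch in Python. It is not Hermes Agent's actual implementation or API: the class LayeredMemory and its method names are hypothetical, and the "persistent" layer is a plain in-memory list standing in for whatever storage a real agent would use.

```python
from collections import deque


class LayeredMemory:
    """Hypothetical three-layer memory (illustration only, not Hermes's API)."""

    def __init__(self, cache_size=3):
        # Layer 1: short-term cache that keeps only the most recent messages.
        self.short_term = deque(maxlen=cache_size)
        # Layer 2: persistent store (a list here; a real agent would use disk or a DB).
        self.persistent = []
        # Layer 3: skill library mapping a skill name to a callable.
        self.skills = {}

    def remember(self, message):
        """Record a message in both the cache and the persistent store."""
        self.short_term.append(message)
        self.persistent.append(message)

    def learn_skill(self, name, func):
        """Save a newly learned skill so it can be reused later."""
        self.skills[name] = func

    def end_session(self):
        """Clear only the short-term cache; the other layers survive."""
        self.short_term.clear()

    def recall(self, keyword):
        """Return persistent messages containing the keyword (case-insensitive)."""
        return [m for m in self.persistent if keyword.lower() in m.lower()]


memory = LayeredMemory(cache_size=2)
memory.remember("User prefers pytest over unittest")
memory.learn_skill("shout", str.upper)
memory.end_session()  # the conversation ends and the cache is wiped

recalled = memory.recall("pytest")        # still available from layer 2
shouted = memory.skills["shout"]("done")  # skill from layer 3 is reusable
```

After end_session(), the short-term cache is empty, yet the earlier preference and the learned skill are still available. That is the difference between a wiped conversation and an agent that applies what it was taught.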

2. Multi-Profile Support

A single agent can switch between multiple personas or domains of expertise. You can toggle between "Python Expert Mode," "Data Analysis Mode," or "Writing Assistant Mode." This isn't just about changing the prompt; it actually loads different skill trees.
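
The difference between swapping a prompt and loading a different skill set can be sketched as follows. The Profile and Agent classes, the skill names, and the prompts are all assumptions made for illustration; they do not come from Hermes's codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical profile: a prompt plus the set of skills it loads."""
    name: str
    system_prompt: str
    skills: tuple


# Illustrative profiles; the skill names are made up for this sketch.
PROFILES = {
    "python": Profile("Python Expert Mode", "You are a Python expert.", ("run_tests", "lint")),
    "data": Profile("Data Analysis Mode", "You analyze datasets.", ("load_csv", "plot")),
    "writing": Profile("Writing Assistant Mode", "You edit prose.", ("proofread",)),
}


class Agent:
    def __init__(self, profile_key):
        self.switch(profile_key)

    def switch(self, profile_key):
        """Switching replaces the prompt and the available skills together."""
        self.profile = PROFILES[profile_key]

    def can(self, skill):
        """Report whether the active profile has loaded the given skill."""
        return skill in self.profile.skills


agent = Agent("python")
lint_in_python_mode = agent.can("lint")
agent.switch("data")
lint_in_data_mode = agent.can("lint")
plot_in_data_mode = agent.can("plot")
```

In this sketch, switching to the data profile removes "lint" and adds "plot", so the change affects what the agent can do and not only how it is instructed to speak.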

3. Tool Integration

It can call external APIs, generate videos, and manipulate files. Through its HyperFrames skill, Hermes can generate complete videos from natural language. This is implemented as a native capability rather than as a wrapper around an external API.

The fact that open-source projects have reached this level means Big Tech can no longer dominate simply by having "more resources." Users choose what works best.


Big Tech's Strategies: Four Different Approaches

Big Tech isn't standing by while open-source seizes the initiative. Interestingly, the strategies of the four major players are quite distinct.

OpenAI: Enterprise-First, Security-Led

OpenAI's strategy is clear: "Secure enterprise customers first, then expand to consumers."

On April 15, OpenAI updated its Agents SDK with three critical features:

  • Native Sandbox: Agents can execute code without crashing the system.
  • File Checks: Scans uploaded files to prevent injection attacks.
  • Memory Recovery for Long-Term Tasks: If execution is interrupted, it can resume from a breakpoint.

These are the points enterprise customers care about most. Individual users might not need them, but they are essential for customers at Walmart's scale.
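
The idea of resuming from a breakpoint can be illustrated with a small checkpointing sketch. This is not the OpenAI Agents SDK API: the function run_with_checkpoint, the fail_at parameter, and the JSON checkpoint file are assumptions chosen only to show the general technique.

```python
import json
import os
import tempfile


def run_with_checkpoint(steps, checkpoint_path, fail_at=None):
    """Run steps in order, saving progress after each completed step.

    Hypothetical helper: `fail_at` simulates an interruption before that index.
    Returns the number of steps executed during this call.
    """
    done = 0
    if os.path.exists(checkpoint_path):
        # Resume from the last recorded position instead of starting over.
        with open(checkpoint_path) as f:
            done = json.load(f)["done"]

    executed = 0
    for index in range(done, len(steps)):
        if index == fail_at:
            raise RuntimeError(f"interrupted before step {index}")
        steps[index]()
        executed += 1
        with open(checkpoint_path, "w") as f:
            json.dump({"done": index + 1}, f)
    return executed


log = []
steps = [lambda name=name: log.append(name) for name in ("fetch", "transform", "upload")]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_with_checkpoint(steps, path, fail_at=2)  # simulated interruption
except RuntimeError:
    pass

resumed_steps = run_with_checkpoint(steps, path)  # picks up at "upload"
```

The second call executes only the remaining step, so "fetch" and "transform" are not repeated. For long-running enterprise tasks, avoiding that repeated work is the point of the feature.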

On the same day, OpenAI released GPT-5.5, natively supporting a multi-agent system where a main agent can delegate tasks to multiple specialized agents.
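
As a rough illustration of the delegation pattern, the sketch below has a main agent route each task to a specialist. The agents here are plain functions and every name (code_agent, docs_agent, SPECIALISTS, main_agent) is hypothetical; it shows the routing idea only, not how any OpenAI model or SDK implements it.

```python
from typing import Callable, Dict, List, Tuple


def code_agent(task: str) -> str:
    """Stand-in for a specialist that handles code changes."""
    return f"[code] patched: {task}"


def docs_agent(task: str) -> str:
    """Stand-in for a specialist that handles documentation."""
    return f"[docs] documented: {task}"


# Registry of specialists the main agent may delegate to (illustrative).
SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "code": code_agent,
    "docs": docs_agent,
}


def main_agent(plan: List[Tuple[str, str]]) -> List[str]:
    """Delegate each (role, task) pair; handle unknown roles itself."""
    results = []
    for role, task in plan:
        specialist = SPECIALISTS.get(role)
        if specialist is None:
            results.append(f"[main] no specialist for: {task}")
        else:
            results.append(specialist(task))
    return results


report = main_agent([
    ("code", "login bug"),
    ("docs", "login flow"),
    ("legal", "license"),
])
```

The main agent never does specialist work itself; it decides who should handle each task and collects the results, falling back to its own handling when no specialist matches.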

Anthropic: The Pursuit of Reliability

Anthropic's approach is more aggressive. Its strategy is to offer fully hosted "Managed Agents" directly.

Users don't need to handle deployment, scaling, or security themselves. Anthropic hosts everything; users simply use the agents.

The accompanying features are powerful:

  • "Dreaming": The agent proactively reviews past conversations and updates its memory. It's not passive saving, but active organization.
  • Outcomes: Success is determined based on defined evaluation criteria. The user defines "success," and the agent works toward that goal.
  • 10 Finance-Specific Pre-Built Agents: Covering high-frequency use cases in the finance industry, such as business plans, credit memos, and risk assessments.

According to the Wall Street Journal, Anthropic's financial services agents are already deployed and running as production systems, not just demos.

Google: The Platform Play

Google consistently adopts a "build the platform, let others build on top" strategy.

At Cloud Next in April, Google announced the Gemini Enterprise Agent Platform:

  • Agent Studio: Visually orchestrate agent workflows.
  • Governance and Security: Enterprise-grade permissions management and audit logs.
  • Integration with Vertex AI: Seamless connection with existing Google Cloud services.

Simultaneously, it released "Gemma 4," an open-source model optimized for agent workflows. This signals an intent to capture the segment that prefers open-source solutions.

Meta: Penetrating the Consumer Segment

Meta's strategy is the most unconventional: "Target consumers and conquer shopping and social media scenarios."

Reuters reports Meta is internally testing an agent called "Hatch" and integrating it into Instagram and WhatsApp. The concept is that if you see a piece of clothing you like on Instagram, the agent can complete the order for you.

At the same time, to reduce reliance on Llama, it is developing a proprietary model called Muse Spark. The desire to own a proprietary model, rather than being constrained by open-source, is evident.


Three Critical Battlegrounds

The battle between Big Tech and open-source is essentially over these three domains.

1. Memory Modules

Why it matters: An agent without memory is stuck in a perpetual "first meeting" state.

Imagine if your coworker forgot everything you had ever told them every time you spoke. That would be unbearable.

Technical approaches are mainly moving in three directions:

  • Hermes: Three-layer architecture (cache + persistent + evolutionary).
  • OpenAI: Native memory recovery and breakpoint resumption.
  • Anthropic: Self-reflection and active organization via "Dreaming."

The memory module is the foundation of an agent's "personality." Whoever sets the standard here will control an agent's "continuity."

2. Multi-Agent Collaboration

Why it matters: Complex tasks require division of labor.

Just as one person can't do the work of an entire team, neither can a single agent.

Notable examples:

  • NVIDIA: Multi-agent supply chain optimization with cuOpt. Uses LangChain for orchestration to automatically plan logistics routes.
  • Research Papers: Highlight the "sovereignty gap" problem in multi-agent systems, where agents constrain each other and fail to reach the correct solution.

Multi-agent collaboration is the agent's "organizational structure." Solving this coordination problem will enable agents to handle increasingly complex tasks.

3. Enterprise Workflows

Why it matters: It's the shortest path to revenue.

Open-source can win developer support, but the real money is with enterprise customers.

What the major players are doing:

  • OpenAI: Commerce agent partnership with Walmart.
  • Anthropic: 10 pre-built agents for financial services.
  • Google: Enterprise governance, security, and orchestration platform.

Enterprise workflows are the agent's "commercialization path." The player who secures the earliest enterprise customers will gain the cash flow needed for continuous improvement.


The Community's Competitive Play: GitHub Stars vs. Funding

How do open-source projects compete with Big Tech's ecosystems?

Hermes provided an answer with the "Hermes Agent Challenge."

The rules are simple:

  • Build something useful with Hermes or share your experience.
  • Prize: Awards worth $1,000.
  • Goal: Capture developer mindshare and build an ecosystem.

Source: https://x.com/ThePracticalDev/status/2055320434850029813

This is a clever strategy. While $1,000 isn't a huge sum, it encourages many developers to try Hermes, share their experience, and build projects. This is how community ecosystems gain momentum.

While Big Tech grabs the market through enterprise contracts, open-source fights for the ecosystem through community challenges. The approaches differ, but they are targeting the same territories.


What's Available to Us Now

Concretely, what features are available right now? Here are three examples.

1. Code Repair

Import an entire project into OpenAI Codex, and it can fix in 30 minutes a bug that would take a human 2.5 hours, roughly a fivefold speedup. This isn't a future feature; it's available now.

2. Video Generation

Using Hermes's HyperFrames skill, you can generate complete videos with natural language. There is no need to learn editing software; you just give instructions.

3. Supply Chain Optimization

NVIDIA's cuOpt multi-agent system automatically plans logistics routes. While it's an enterprise application, the principle is the same: "executing complex tasks via multi-agent collaboration."


Key Metrics to Watch for Late 2026

The power map is drawn. Now it's about who can actually claim the territory.

Three metrics to watch:

1. Will Hermes keep growing beyond its current 154K stars and become entrenched?

If Hermes becomes the standard open-source agent, it will prove the community has the capability to define the next-generation interaction paradigm.

2. How many early enterprise customers can Big Tech platforms secure?

Among OpenAI, Anthropic, and Google, who will be the first to secure over 10 Fortune 500 clients? This will be key to gaining a first-mover advantage.

3. Will the "sovereignty gap" problem in multi-agent collaboration be solved?

If multi-agent systems can collaborate stably, agents can handle even more complex tasks. If not, they'll remain mere "toys."


The battle for dominance in AI agents has only just begun.

Two months ago, agents were experimental toys. Today, they are productive tools.

What comes next? Let's wait and see.
