
OpenClaw’s $1.3 Million OpenAI Bill: What AI Agents Actually Cost in Production

Peter Steinberger spent a decade building PSPDFKit into a PDF framework running on over a billion devices. He joined OpenAI in February 2026, saying “I want to change the world, not build a large company.” A few months later, his open-source project OpenClaw, the fastest-growing project in GitHub history with over 300,000 stars and 3.2 million users, racked up an OpenAI bill of $1,305,088.81 in a single month.

603 billion tokens. 7.6 million API requests. 100 Codex agents running simultaneously. The OpenClaw cost breakdown is the first real look at what autonomous AI agents cost in production.

That’s $13,000 per agent per month.

And OpenAI is covering the bill as a “research investment.” Regular companies don’t get that deal.
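The unit economics behind those headline figures are simple division. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the results are blended averages, since the token and request counts are rounded and are not split by input, cached, or output tokens.

```python
# Figures as quoted in the article (token and request counts are rounded).
bill_usd = 1_305_088.81
tokens = 603e9
requests = 7.6e6
agents = 100

per_agent = bill_usd / agents                     # USD per agent per month
per_request = bill_usd / requests                 # USD per API request
per_million_tokens = bill_usd / (tokens / 1e6)    # blended USD per 1M tokens
tokens_per_request = tokens / requests            # average tokens per request

print(f"per agent:          ${per_agent:,.2f}")
print(f"per request:        ${per_request:.4f}")
print(f"per million tokens: ${per_million_tokens:.2f}")
print(f"tokens per request: {tokens_per_request:,.0f}")
```

That works out to roughly $0.17 per request, about $2.16 per million tokens, and around 79,000 tokens per request on average.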

The OpenClaw Cost Breakdown

OpenClaw is a self-hosted autonomous AI assistant. It connects to your email, calendar, browser, Slack, Discord, WhatsApp, and iMessage. Agents execute shell commands, manage files, and automate web tasks through a growing skill registry. The 100 agents running on Steinberger’s setup were doing real work: reviewing pull requests, scanning commits for security vulnerabilities, deduplicating GitHub issues, writing and submitting fixes, monitoring performance benchmarks, even attending meetings and generating feature PRs.

This wasn’t a demo. This was production. The distinction matters, because every guru demo stops before the billing cycle starts.

The primary model was GPT-5.5 running in Fast Mode, which consumed tokens at higher rates. Steinberger noted that disabling Fast Mode would drop the bill to roughly $300,000 per month. A 70% reduction. Still $3,000 per agent per month at the “optimized” rate. Still $3.6 million annually.

Why This Matters More Than the Headline

The headline number is dramatic, but the per-agent cost is the real story.

$13,000 per month per agent on full pricing. $3,000 per month on optimized pricing. These aren’t projections from a whitepaper. These are invoiced numbers from someone who works at OpenAI running agents on OpenAI’s own infrastructure.

Now think about the gap between Steinberger and a newcomer. He’s an experienced engineer who built billion-device software. He has OpenAI’s internal knowledge. He has a “research investment” subsidy covering the bill. He knows to disable Fast Mode for a 70% cost reduction.

A first-time builder doesn’t know any of that. They’ll hit the high-rate pricing, run agents longer than necessary, retry failed calls without cost caps, and discover the bill at the end of the month. If Steinberger’s optimized setup costs $3,000 per agent, a newcomer’s unoptimized setup will cost more. Possibly much more.

The Guru Problem

Scroll through YouTube and LinkedIn right now. “Deploy AI agents for your business.” “Build an autonomous AI workforce.” “Replace your team with agents.” The pitch is seductive. Agents are cheap, they scale, they work while you sleep.

Nobody mentions $13,000 per month per agent.

Nobody mentions that 100 agents running GPT-5.5 burn through 603 billion tokens in 30 days. Nobody mentions that “Fast Mode” isn’t just faster, it’s dramatically more expensive. And nobody talks about how even the optimized version, built by someone who works at the company that makes the model, still costs $3.6 million per year.

The gap between what’s being sold and what’s being spent is the widest I’ve seen in tech. And it’s widest for the people with the least ability to absorb the surprise: small businesses, indie developers, and first-time builders who took the gurus at their word.

What Practitioners Already Knew

I wrote about this months ago. Autonomous AI agents look great in demos and burn cash in production. The OpenClaw numbers validate what practitioners already knew. The question was never “can agents do the work?” It was always “can you afford to let them?”

When I build AI systems, cost control isn’t an afterthought. It’s architecture. It’s why harness engineering exists as a discipline. The memory server I run has a condensation layer specifically because raw search results were burning through the context window. Hundreds of thousands of characters of raw output compressed to a few thousand. That’s not clever engineering. That’s survival. Without it, every session would have been its own version of Steinberger’s bill, just at a smaller scale.
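To make the idea of a condensation layer concrete, here is a minimal sketch. It is not the memory server’s actual implementation, which the article does not show: the function name `condense` and both limits are hypothetical, and simple deduplication plus truncation stands in for whatever ranking or summarization a real layer would use.

```python
def condense(results: list[str], max_chars: int = 4000, per_item: int = 400) -> str:
    """Squeeze raw search results into a fixed character budget.

    Hypothetical sketch: collapse whitespace, truncate each result,
    drop exact duplicates, and stop once the total budget is used up.
    """
    seen: set[str] = set()
    parts: list[str] = []
    used = 0
    for text in results:
        # Collapse runs of whitespace, then cap the length of this result.
        snippet = " ".join(text.split())[:per_item]
        if not snippet or snippet in seen:
            continue
        cost = len(snippet) + (1 if parts else 0)  # +1 for the newline separator
        if used + cost > max_chars:
            break
        seen.add(snippet)
        parts.append(snippet)
        used += cost
    return "\n".join(parts)


raw = ["alpha " * 1000, "alpha " * 1000, "beta\n\n  gamma"]
condensed = condense(raw, max_chars=500, per_item=100)
print(len("".join(raw)), "->", len(condensed), "characters")
```

The point of the design is that the budget is enforced mechanically: however large the raw output gets, what reaches the context window never exceeds `max_chars`.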

I co-founded Aether Global Technology, a Salesforce consulting partner in Manila. When clients ask about AI agent deployment, the first conversation isn’t about what the agent can do. It’s about what the agent will cost per month, and what happens when it runs unsupervised for a weekend.

Most agent frameworks ship without cost caps, token budgets, or kill switches. The agent swarming piece I wrote covers why multi-agent coordination fails in production. The OpenClaw bill is what that failure looks like in dollars.

The Uncomfortable Math

Let’s do the math the gurus won’t.

| Scenario | Monthly Cost | Annual Cost |
| --- | --- | --- |
| 1 agent (full pricing) | $13,000 | $156,000 |
| 1 agent (optimized) | $3,000 | $36,000 |
| 10 agents (optimized) | $30,000 | $360,000 |
| 100 agents (optimized) | $300,000 | $3,600,000 |
| 100 agents (full pricing) | $1,300,000 | $15,600,000 |

For context, the median annual salary for a software engineer in the Philippines is roughly $15,000-20,000. At $156,000 a year, one unoptimized AI agent costs eight to ten times that, about the same as a full-time senior developer in a high-wage market. Ten optimized agents, at $360,000 a year, cost more than a small engineering team.

“Replace your team with agents” stops sounding cheap when you do the multiplication.
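The multiplication is easy to reproduce for your own fleet size. The sketch below assumes the article’s rounded per-agent rates and identical agents; the constant and function names are illustrative.

```python
FULL_PER_AGENT = 13_000       # USD per agent per month, rounded as in the article
OPTIMIZED_PER_AGENT = 3_000   # USD per agent per month with Fast Mode disabled


def scenario(agents: int, per_agent_monthly: int) -> tuple[int, int]:
    """Return (monthly, annual) cost in USD for a fleet of identical agents."""
    monthly = agents * per_agent_monthly
    return monthly, monthly * 12


rows = [
    ("1 agent (full pricing)", 1, FULL_PER_AGENT),
    ("1 agent (optimized)", 1, OPTIMIZED_PER_AGENT),
    ("10 agents (optimized)", 10, OPTIMIZED_PER_AGENT),
    ("100 agents (optimized)", 100, OPTIMIZED_PER_AGENT),
    ("100 agents (full pricing)", 100, FULL_PER_AGENT),
]

for label, agents, rate in rows:
    monthly, annual = scenario(agents, rate)
    print(f"{label:<28} ${monthly:>10,} ${annual:>11,}")

# Reduction implied by the invoiced total and the roughly $300,000 estimate.
implied_reduction = 1 - 300_000 / 1_305_088.81
print(f"implied reduction: {implied_reduction:.0%}")
```

Taken at face value, the invoiced total and the $300,000 estimate imply a reduction of about 77%, in the same range as the 70% figure quoted for disabling Fast Mode.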

What To Actually Do

The OpenClaw bill has four lessons that matter for anyone considering AI agents for real work.

Know your token economics before you deploy. Steinberger discovered that Fast Mode was the primary cost driver. That’s a setting. One toggle. 70% cost difference. If you don’t understand your pricing tier, your model’s token consumption pattern, and your request volume, you’re deploying blind.

Build cost controls into the architecture. Token budgets per agent, spend thresholds that trigger alerts or kill switches, session caps, retry limits. These aren’t features you add later. They’re load-bearing walls. I wrote a tutorial on building pre-action gates for exactly this kind of mechanical enforcement.

Start with one agent, not a swarm. Steinberger ran 100 agents because he could afford to (OpenAI was paying). You can’t. One agent, measured, monitored, optimized. Then scale. The architecture that prevents AI agents from taking destructive actions starts with one agent and one set of gates.

Question the subsidy. OpenAI covering Steinberger’s bill as “research investment” means these costs aren’t sustainable at market rates. When your favorite guru says “just deploy agents,” ask who’s paying the token bill. If the answer involves investor subsidies or promotional pricing, the real cost is being hidden, not eliminated.
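The cost controls named in the second lesson (per-agent budgets, spend thresholds, kill switches, retry limits) fit in a few dozen lines. This is a minimal sketch, not the pre-action gate from the tutorial mentioned above: `BudgetGuard`, `BudgetExceeded`, and `call_with_guard` are hypothetical names, and the caller is assumed to supply a cost estimate before each call and the actual cost after it.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


class BudgetExceeded(RuntimeError):
    """Raised when a call would breach the spend cap or the retry limit."""


@dataclass
class BudgetGuard:
    monthly_cap_usd: float        # hard cap: the kill switch
    alert_fraction: float = 0.8   # soft threshold: flag at 80% of the cap
    max_retries: int = 2          # retries allowed after the first attempt
    spent_usd: float = 0.0
    alerted: bool = False

    def check(self, estimated_cost_usd: float) -> None:
        """Pre-action gate: refuse any call that would push spend past the cap."""
        if self.spent_usd + estimated_cost_usd > self.monthly_cap_usd:
            raise BudgetExceeded(
                f"cap ${self.monthly_cap_usd:.2f} reached, spent ${self.spent_usd:.2f}"
            )

    def record(self, cost_usd: float) -> None:
        """Book spend and set the alert flag once the soft threshold is crossed."""
        self.spent_usd += cost_usd
        if self.spent_usd >= self.alert_fraction * self.monthly_cap_usd:
            self.alerted = True


def call_with_guard(
    guard: BudgetGuard,
    call: Callable[[], Tuple[str, float]],  # returns (result, actual cost in USD)
    estimated_cost_usd: float,
) -> str:
    """Run call() behind the gate, retrying at most guard.max_retries times."""
    last_error = None
    for _ in range(1 + guard.max_retries):
        guard.check(estimated_cost_usd)
        try:
            result, cost_usd = call()
        except Exception as exc:
            # Assumption: a failed call is booked at its estimate, since
            # failed requests can still consume tokens.
            guard.record(estimated_cost_usd)
            last_error = exc
            continue
        guard.record(cost_usd)
        return result
    raise BudgetExceeded(f"gave up after {guard.max_retries} retries") from last_error
```

Because `check` runs before every attempt, including retries, a retry loop cannot spend past the cap; the agent stops with an exception instead of a surprise on the invoice.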

Frequently Asked Questions

How much does it cost to run an AI agent in production?

Based on OpenClaw’s published numbers, a single autonomous AI agent running GPT-5.5 costs approximately $13,000 per month at full pricing, or $3,000 per month with optimized settings (disabling Fast Mode). Actual costs depend on the model, token consumption patterns, and whether cost controls like retry limits and session caps are in place.

Why are AI agent costs so high?

AI agents make many API calls per task, each consuming tokens. OpenClaw’s 100 agents generated 7.6 million API requests and consumed 603 billion tokens in 30 days. Unlike a chatbot conversation, an autonomous agent running continuously accumulates token costs around the clock. Fast Mode and retry loops multiply these costs further.

Can you reduce AI agent costs?

Yes. Steinberger noted that disabling Fast Mode alone would reduce costs by roughly 70%. Other strategies include setting token budgets per agent, implementing spend thresholds with kill switches, routing mechanical tasks to cheaper models instead of running everything on frontier-tier pricing, and starting with a single agent before scaling.


Tom Tokita is co-founder of Aether Global Technology. He writes from Manila about AI operations and what works in production.
