I thought I had the cost model figured out.
I had switched the primary model to Haiku. I had disabled the nightly synthesis process. I was running a pull-based cron schedule instead of expensive polling. By my estimate, the whole system should have cost under $10 a month in AI API fees.
Then I opened the Anthropic console and saw $5 spent that day. On Sonnet 4.6. From a server I had configured to use a different model entirely.
That was the third billing mistake I made while building our agency's AI agent layer. Here is a full account of all three: the exact root cause of each, and the specific things I now check after any setup to make sure it does not happen again.
Mistake One: The Dreaming System I Did Not Know About
The memory plugin in OpenClaw ships with a feature called the dreaming system. It runs nightly at 3am. It takes your daily memory logs and synthesizes them into long-term insights across three phases: light, REM, and deep.
The synthesis model is hardcoded to OpenAI. gpt-4.1 for light and REM. gpt-5.2 for deep. There is no configuration option. The plugin's schema has no model field. It runs, every night, on OpenAI.
I did not know. I had not looked at the OpenAI usage dashboard since onboarding. The first sign was a $97 charge on a platform I thought I was barely using.
When I dug in: 102 million tokens on gpt-5.2. From a process I never intentionally enabled, running at 3am while I slept.
If you self-host OpenClaw and enable the memory-core plugin, the dreaming system is on by default. It bills OpenAI silently. Check your OpenAI usage dashboard before anything else.
The fix was three steps. Disable dreaming in the config:
"plugins": {
  "entries": {
    "memory-core": {
      "config": { "dreaming": { "enabled": false } }
    }
  }
}
Then delete the OpenAI key from the server environment. Finally, revoke the key at the source in the OpenAI platform dashboard. Even with the key removed from the server, I wanted a hard stop regardless of what state anything was in.
The daily memory logs still accumulate. Nothing is lost. I run openclaw memory promote manually once a month when I want the synthesis. One-off cost, five dollars. I decide when it happens.
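To confirm the setting actually landed, a small check like the one below can read the parsed config and report whether dreaming would still run. This is a minimal sketch, not part of OpenClaw: the function name is hypothetical, and it assumes that a missing "enabled" key means the default (on) and that a missing memory-core entry means the plugin is not enabled.

```python
import json


def dreaming_enabled(config: dict) -> bool:
    """Return True if the memory-core dreaming feature would run.

    Assumptions (not confirmed by OpenClaw docs): a missing "enabled"
    key falls back to the default of on, and a missing memory-core
    entry means the plugin is not enabled at all.
    """
    entries = config.get("plugins", {}).get("entries", {})
    if "memory-core" not in entries:
        return False
    dreaming = entries["memory-core"].get("config", {}).get("dreaming", {})
    return dreaming.get("enabled", True)


# Example with the fragment shown above, wrapped in braces to make valid JSON.
sample = json.loads(
    '{"plugins": {"entries": {"memory-core": '
    '{"config": {"dreaming": {"enabled": false}}}}}}'
)
print(dreaming_enabled(sample))  # prints False
```

Running it against the real file means loading that file with json.load instead of the inline sample.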
Mistake Two: Running Local and Cloud at the Same Time
When I moved OpenClaw from my Mac to the Hostinger VPS, I kept the local instance running. The reasoning felt sound: use local for testing, let Hostinger handle production.
The problem is that OpenClaw's Slack integration uses Socket Mode. One App Token, one connection. When both instances were running, they competed for the connection. Whichever one reconnected last would win, briefly. Then the other would reconnect. The Slack channel flip-flopped between two agents in completely different states.
Beyond the connection conflict, the two instances were reading different workspace files. Local had the full workspace I had built. The Hostinger container had an eight-file bootstrap from initial setup. When the server-side agent ran a skill, it was reading a different version of that skill than the one I was editing locally. I spent two weeks assuming the skill files had bugs before I understood what was actually happening.
Both instances were also billing separately. Every API call the local instance made went to the same Anthropic account. I had effectively doubled my costs without noticing.
The fix: stop local permanently with openclaw gateway stop. Rsync the full local workspace into the container. Add that rsync to the deploy pipeline so every git push keeps the container workspace in sync. One agent, one workspace, one bill.
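The two-week detour came from two workspaces that had quietly diverged. One way to make that visible is to hash both trees and list the differences. The sketch below assumes you have a copy of the container workspace in a local directory (how you pull it is up to you); the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hashes(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def workspace_drift(local: Path, deployed: Path) -> dict:
    """Compare two workspace trees and report what differs."""
    a, b = file_hashes(local), file_hashes(deployed)
    return {
        "missing_on_server": sorted(a.keys() - b.keys()),
        "extra_on_server": sorted(b.keys() - a.keys()),
        "different": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

An empty result in all three lists means the server is reading the same skill files you are editing. Anything under "different" is a file where the agent and the editor disagree.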
Mistake Three: Two Config Files, One That Actually Matters
This is the one that stings most, because I thought I had already fixed it.
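OpenClaw has two completely separate openclaw.json files on the server: the workspace config, which travels with the workspace files my deploy pipeline pushes, and the system config at /data/.openclaw/openclaw.json, which the onboarding wizard wrote. This is not documented anywhere obvious.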
I spent months editing the workspace config to control the primary model. Changed it to Haiku. Then to Gemini Flash. The config file looked right. The deploy pipeline confirmed the change was on the server. And the container kept using Sonnet 4.6 without logging a word about it.
The workspace config's model setting is overridden by the system config. No warning. No log line that says "ignoring workspace model." The system config just quietly wins, every time.
The system config had claude-sonnet-4-6 set as the primary model since April 10, the day the Hostinger onboarding wizard ran. Every cron run for 47 days: reply checks, campaign monitoring, follow-up scheduling. All billing Sonnet.
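Since the override produces no warning, a check that compares the two files can supply one. This is a minimal sketch with hypothetical function names. It assumes the workspace config stores the model under the same agents.defaults.model.primary path as the system config, which I have not verified.

```python
from typing import Optional

MODEL_PATH = ("agents", "defaults", "model", "primary")


def model_at(config: dict, path=MODEL_PATH) -> Optional[str]:
    """Walk nested keys; return None if any level is missing."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def check_model_override(workspace: dict, system: dict) -> Optional[str]:
    """Return a warning if the two configs name different primary models."""
    ws, sys_model = model_at(workspace), model_at(system)
    if ws is not None and sys_model is not None and ws != sys_model:
        return (
            f"workspace sets {ws!r} but system config sets {sys_model!r}; "
            "the system value wins"
        )
    return None
```

Fed the two parsed files from my server, this would have returned a warning on day one instead of day 47.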
The fix is a direct patch to the system config:
docker exec openclaw-vsg4-openclaw-1 python3 -c "
import json
path = '/data/.openclaw/openclaw.json'
# Load the system config, the file the container actually reads.
with open(path, 'r') as f:
    config = json.load(f)
# Replace the primary model the onboarding wizard set.
config['agents']['defaults']['model']['primary'] = 'google/gemini-2.5-flash'
with open(path, 'w') as f:
    json.dump(config, f, indent=4)
"
And then add that patch to the deploy pipeline, so every git push enforces the correct model on the system config automatically.
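For the pipeline step, it helps if the patch is idempotent and reports whether it changed anything, so a deploy log shows when the config had drifted. The following is a sketch under that assumption, with a hypothetical function name. It touches only the agents.defaults.model.primary key used in the patch above.

```python
import json
from pathlib import Path


def enforce_primary_model(config_path: Path, model: str) -> bool:
    """Set agents.defaults.model.primary; return True if the file changed."""
    config = json.loads(config_path.read_text())
    # setdefault creates any missing level instead of raising KeyError.
    model_cfg = (
        config.setdefault("agents", {})
        .setdefault("defaults", {})
        .setdefault("model", {})
    )
    if model_cfg.get("primary") == model:
        return False  # already correct, leave the file untouched
    model_cfg["primary"] = model
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    return True
```

A True return on a routine deploy is itself a signal: something rewrote the system config since the last push.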
What I Verify Now After Any Setup
These three mistakes share one pattern: something was running in a way I did not intend, with no indication anything was wrong. The only way to catch it was to audit the actual runtime state, not the config I thought I had deployed.
Now I check these things explicitly after any new setup or config change:
Billing alerts on every provider. Anthropic console, OpenAI platform, Google AI Studio. Set the threshold low enough to catch a runaway process within one day. If I had done this from the start, mistake one would have cost a single day of billing.
The system config, read from inside the container. After any model change, SSH in and read the file the container actually uses: docker exec openclaw-vsg4-openclaw-1 python3 -c "import json; c=json.load(open('/data/.openclaw/openclaw.json')); print(c['agents']['defaults']['model']['primary'])". Not the workspace config. The system config.
The usage dashboard on each provider. Filter by the server's API key specifically. If you see a model you did not configure, the model override is coming from somewhere in the runtime, not your config files.
A single running instance. One App Token, one active connection. If you run local and server simultaneously, they conflict and both bill. Local is for editing files. The server is for operations.
The model patch in the deploy pipeline. Do not rely on a config file that might be silently overridden. Patch the system config as part of every deployment. If it drifts between pushes, the next deploy corrects it automatically.
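The usage check in particular is easy to script once you have the numbers out of a dashboard. The sketch below assumes a hypothetical export: a list of rows, each with a "model" name and a "cost_usd" figure, already filtered to the server's API key. No provider's real export format or API is implied.

```python
from collections import defaultdict


def unexpected_models(rows, allowed):
    """Sum cost per model for every row whose model is not in `allowed`."""
    totals = defaultdict(float)
    for row in rows:
        if row["model"] not in allowed:
            totals[row["model"]] += row["cost_usd"]
    return dict(totals)


# Hypothetical rows for one day, for illustration only.
rows = [
    {"model": "claude-sonnet-4-6", "cost_usd": 2.0},
    {"model": "google/gemini-2.5-flash", "cost_usd": 0.1},
    {"model": "claude-sonnet-4-6", "cost_usd": 3.0},
]
print(unexpected_models(rows, {"google/gemini-2.5-flash"}))
# prints {'claude-sonnet-4-6': 5.0}
```

Any non-empty result means a model is billing that you did not configure, which is exactly the situation in mistake three.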
The cost of building on top of a platform you do not fully understand is measured in dollars, not just time. Every hidden default has a price. The only way to find them is to audit actual spend against intended configuration, regularly, before it adds up.
The total across all three mistakes is somewhere between $250 and $350 depending on how far back the dreaming system ran before I noticed. Real money. Every dollar came from a setting I did not set, a file I did not know existed, or two instances I thought were redundant.
Build the billing alerts first. Verify the actual runtime config, not the one you think you deployed. Then trust the system.
FAQ
Does OpenClaw's workspace config control anything important?
Yes, but not the primary model. The workspace config controls memory search paths, extra context files, and workspace-level agent settings. For model selection, only the system config at /data/.openclaw/openclaw.json matters. Change the system config for the new model to take effect, and update the workspace config as well so the two files do not disagree.
How do I disable the dreaming system if I already have costs?
Set plugins.entries.memory-core.config.dreaming.enabled = false in your workspace openclaw.json, then also remove or revoke your OpenAI key. Revoke it at the source in the OpenAI platform dashboard. That is the hard stop regardless of server state.
What model should I use for automated cron tasks?
For structured ops like reply classification, campaign analysis, and follow-up logic, Gemini 2.5 Flash performs on par with Claude Haiku at near-zero cost. For draft writing and nuanced reasoning, Sonnet is stronger. The architecture we run now uses Gemini Flash for all automated crons and reserves Sonnet for tasks that need it.
Can I run OpenClaw locally and on a server at the same time?
Not safely with Slack. Socket Mode allows one connection per App Token. If both instances are running, they compete for the connection and both agents operate in different states. Stop local with openclaw gateway stop before relying on the server.
How do I verify which model the container is actually using?
SSH into the server and run: docker exec openclaw-vsg4-openclaw-1 python3 -c "import json; c=json.load(open('/data/.openclaw/openclaw.json')); print(c['agents']['defaults']['model']['primary'])". That reads the file the container actually uses, not the workspace config.
