When people hear "agentic outbound," they usually imagine one of two things: a fully autonomous system that runs without human involvement, or a marketing term for software that does what software has always done.
Neither is accurate. Here is what it actually looks like when the system is running well.
The Morning
I open my laptop. There is a Slack message in the campaign channel from 8:02am:
Reply check | 2026-05-24 8:02am PDT
New conversations (2):
• Sarah Chen | Positive | linkedin.com/in/sarahchen | HeyReach/BTS | Draft ready
• James Rivera | Question | james@domain.com | Instantly | Draft ready
Existing conversations (1):
• Troy Lunt | Soft Yes | linkedin.com/in/troylunt | HeyReach/BTS | Draft ready
I go to Airtable. Three records are waiting in the Reply Queue with status "Draft Ready." I read each one. The system has already read the full thread, enriched the lead profile where it could, classified the reply, and written a draft.
Sarah Chen replied with genuine curiosity. Her draft acknowledges what she said, adds one specific line based on her LinkedIn background, and ends with a soft question to keep the conversation moving. It is good. I approve it with one click.
James Rivera asked about pricing. His draft answers the question directly in two sentences, then asks which part of their current process is the bottleneck. Also good. Approve.
Troy Lunt replied "yes, go ahead." His draft is three sentences: confirms we will send the overview, acknowledges one thing he mentioned in his profile, and offers times for a call. Approve.
Total time: four minutes. Three replies reviewed and approved, now queued to send.
What the Agent Actually Runs
There are four scheduled runs that happen every day, regardless of whether I am at my laptop.
7am: Campaign health check. The agent pulls health metrics from Instantly and HeyReach for every active campaign: bounce rates, reply rates, and step completion percentages. If anything crosses a threshold, it posts a flag to the system channel. Most mornings there is nothing. When there is something, I see it before I start working.
8am: Reply handler. Full pull from all platforms: HeyReach (LinkedIn), Instantly (email), Smartlead (email, for some clients), and Lemlist (for one client's LinkedIn campaigns). Every unread reply gets classified, enriched where possible, and drafted. See the full reply handler breakdown for how it works. The results post to Slack. The drafts sit in Airtable waiting for review.
2pm: Reply handler again. Same as 8am but catches anything that came in during the day. Late morning replies do not sit until the next morning.
Daily: Signal research. For clients with configured signal campaigns, the agent runs the defined searches: LinkedIn post signals via Apify, job posting searches via SearXNG, and web mentions. Results get scored and added to the lead queue for review.
In between those scheduled runs, there is also an inbox safety scan that checks for anything the main runs might have missed. It runs every two hours during business hours.
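To make the schedule concrete, here is a minimal sketch of how those runs could be expressed as a single "what is due right now" check. It is an illustration, not the production code. The 7am, 8am, and 2pm times and the two-hour scan interval come from the description above; the signal research time, the 9-to-5 business-hours window, and all function and job names are assumptions.

```python
from datetime import datetime, time

# Fixed run times taken from the schedule described above.
FIXED_RUNS = {
    time(7, 0): "campaign_health_check",
    time(8, 0): "reply_handler",
    time(14, 0): "reply_handler",
}

# Assumptions: no time is stated for signal research, and "business hours"
# is not defined, so both values below are placeholders.
SIGNAL_RESEARCH_AT = time(6, 0)
BUSINESS_HOURS = (9, 17)          # first and last hour a safety scan may start
SAFETY_SCAN_EVERY_HOURS = 2


def jobs_due(now: datetime) -> list[str]:
    """Return the names of the jobs that should start at this minute."""
    jobs = []
    current = now.time().replace(second=0, microsecond=0)

    if current in FIXED_RUNS:
        jobs.append(FIXED_RUNS[current])
    if current == SIGNAL_RESEARCH_AT:
        jobs.append("signal_research")

    # Inbox safety scan: on the hour, every two hours, inside business hours.
    start, end = BUSINESS_HOURS
    on_the_hour = now.minute == 0
    in_window = start <= now.hour <= end
    if on_the_hour and in_window and (now.hour - start) % SAFETY_SCAN_EVERY_HOURS == 0:
        jobs.append("inbox_safety_scan")

    return jobs
```

A scheduler (cron or anything similar) could call `jobs_due` once a minute and dispatch whatever it returns. Keeping the schedule in one table makes it easy to see at a glance what runs when.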
What I actually do as a result of all this: review drafts in Airtable, make judgment calls on anything the system flagged as unclear, and handle the conversations that require real strategic input. The routine operations do not need me.
A Normal Week
Monday: Reply queue has the weekend backlog. Usually eight to fifteen drafts. I clear them in under twenty minutes. The agent posts the client campaign summaries to their Slack channels automatically, so clients already know what happened over the weekend before I send anything.
Tuesday through Thursday: Daily reply queues are two to six drafts each morning and afternoon. Occasional system flags when something in a campaign dips below threshold. The signal research for clients with active searches surfaces leads periodically.
Friday: Weekly review. I look at campaign performance across all clients. The agent has already pulled the metrics. I spend time thinking about what to change, not pulling the data.
What I used to spend time on that I no longer do:
- Manually checking every inbox on every platform
- Copying replies into a document to track them
- Writing follow-up messages from scratch for every positive reply
- Updating Airtable with campaign status by hand
- Sending client status update emails
What I still do that the agent does not:
- Strategic decisions about campaign pivots
- Client relationship management
- ICP refinement when a campaign is not converting
- Approving every outgoing message before it sends
- Handling edge cases the system flags as unclear
When It Goes Wrong
The system breaks in predictable ways, and that is the key insight: because each failure mode is known in advance, it can be made to fail visibly.
When an API goes down, the morning cron posts an error to the system channel. I see it when I open Slack. The replies are still unread on the platform, waiting for the next run.
When a classification is ambiguous, the reply stays in the queue as "Unclear" rather than generating a draft. The Slack summary flags it. I look at it manually and decide.
When a campaign dips below the health threshold, the alert fires before I would have noticed organically. I can investigate and act before it becomes a significant problem.
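A health alert of this kind can be as simple as comparing each metric to a limit. The sketch below shows the idea only. The three metrics are the ones the health check pulls (bounce rate, reply rate, step completion); the threshold values, the `CampaignHealth` class, and the `health_flags` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CampaignHealth:
    name: str
    bounce_rate: float       # fraction of sends that bounced, 0.0 to 1.0
    reply_rate: float        # fraction of sends that got a reply
    step_completion: float   # fraction of sequence steps completed


# Hypothetical thresholds; real values would be tuned per platform and client.
MAX_BOUNCE_RATE = 0.03
MIN_REPLY_RATE = 0.01
MIN_STEP_COMPLETION = 0.80


def health_flags(c: CampaignHealth) -> list[str]:
    """Return one human-readable flag per metric that crosses its threshold."""
    flags = []
    if c.bounce_rate > MAX_BOUNCE_RATE:
        flags.append(f"{c.name}: bounce rate {c.bounce_rate:.1%} is above {MAX_BOUNCE_RATE:.1%}")
    if c.reply_rate < MIN_REPLY_RATE:
        flags.append(f"{c.name}: reply rate {c.reply_rate:.1%} is below {MIN_REPLY_RATE:.1%}")
    if c.step_completion < MIN_STEP_COMPLETION:
        flags.append(f"{c.name}: step completion {c.step_completion:.0%} is below {MIN_STEP_COMPLETION:.0%}")
    return flags
```

An empty list means nothing gets posted, which matches the "most mornings there is nothing" behavior. Any non-empty list would go to the system channel as-is.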
None of these are failures I have to discover by checking dashboards. The system tells me when something needs my attention. If the system is not telling me anything, things are running as expected.
The most useful thing a well-built agent does is not the work itself. It is knowing what to surface. A system that handles 90% automatically and routes the other 10% to you with full context is more valuable than one that tries to handle 100% and occasionally makes expensive mistakes silently.
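The "route the rest to a human" rule can be sketched in a few lines. This is an assumption-laden illustration: the labels are the ones that appear in the Slack summary earlier, and "Unclear" and "Draft Ready" are the statuses mentioned in this article, but the confidence score, the 0.75 cutoff, and the `route` function are hypothetical.

```python
from dataclasses import dataclass

# Labels seen in the Slack summary; the real label set may be larger.
KNOWN_LABELS = {"Positive", "Question", "Soft Yes"}

# Assumed cutoff below which a classification counts as ambiguous.
CONFIDENCE_FLOOR = 0.75


@dataclass
class Classification:
    label: str
    confidence: float  # assumed 0.0 to 1.0 score reported by the classifier


def route(result: Classification) -> dict:
    """Decide whether a reply gets a draft or is held for manual review."""
    ambiguous = (
        result.label not in KNOWN_LABELS
        or result.confidence < CONFIDENCE_FLOOR
    )
    if ambiguous:
        # No draft is written; the reply is flagged in the Slack summary.
        return {"status": "Unclear", "generate_draft": False, "flag_in_slack": True}
    return {"status": "Draft Ready", "generate_draft": True, "flag_in_slack": False}
```

The design choice is that doubt defaults to the human. An unknown label or a low score never produces a draft, so the expensive mistake (a confident-sounding reply to a misread message) cannot happen silently.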
What Clients Experience
Clients do not interact with the agent directly. They have a dedicated Slack channel where campaign updates post automatically. Every morning, the channel shows what the agent found, what happened with their active sequences, and any conversations that need attention.
Most clients stop asking for status updates within the first two weeks. The channel is the status update. They can see it anytime without asking. For the conversations that require my judgment, I handle them and post a brief summary in the same channel.
From a client's perspective: things run without requiring them to manage us. Positive replies surface in Slack with a draft ready. They do not wait for a Monday morning report to know what happened last Tuesday.
The Ceiling
There are things the agent cannot do that matter a great deal.
It cannot read the subtext of a reply that requires knowing the client's relationship history with the prospect. It cannot decide when a campaign approach is fundamentally wrong and needs to be rebuilt. It cannot build the trust that makes someone want to take a meeting.
The agent runs the operations that do not require those things. Everything else stays with me.
At current capacity, the system handles four active client campaigns. My sense is the ceiling is eight to ten campaigns before client relationship management becomes the bottleneck rather than operations. The agent scales the operations. The human still needs to manage the strategy and the relationships. For a full look at the infrastructure behind it, see The AI Stack We Built to Run Client Outbound.
Build the approval step before you automate the sending. The temptation is to close the loop and let it run fully autonomous. The cost of a misclassified reply that sends without review is always higher than the time saved by skipping approval. Get the workflow right first. Trust the automation after it has earned it.
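In code, the approval step is a guard that sits in front of the send call. A minimal sketch, assuming the queue is a list of records with a status field: "Draft Ready" is the status used in Airtable above, while the "Approved" and "Sent" statuses, the `QueuedReply` class, and the `send` callback are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QueuedReply:
    lead: str
    draft: str
    status: str  # e.g. "Draft Ready", "Unclear", "Approved", "Sent"


def send_approved(queue: list[QueuedReply],
                  send: Callable[[QueuedReply], None]) -> list[str]:
    """Send only records a human has approved; return the leads that were sent."""
    sent = []
    for reply in queue:
        if reply.status != "Approved":
            continue  # drafts and unclear replies never reach the send call
        send(reply)
        reply.status = "Sent"  # marks the record so it is not sent again
        sent.append(reply.lead)
    return sent
```

Because the guard checks for one explicit status rather than excluding a list of bad ones, any new or unexpected status is treated as "do not send" by default.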
Frequently Asked Questions
How long did it take to get here? About five months from the first broken prototype to something I trusted enough to run client campaigns on. The reply handler took three weeks to build. Signal detection took six. The hardest part was not the code. It was working out what the system needed to handle before I understood everything that could break.
Does the agent ever send something I did not approve? No. Every outgoing message goes through Airtable and requires my approval before the send command runs. This is non-negotiable. The approval step is there because misclassification is possible, tone is hard to get right every time, and some conversations require judgment the agent does not have.
What happens if the server goes down? Replies stay unread on the platforms. Nothing is lost. When the server comes back, the next scheduled run picks up where it left off. The dedup logic in the reply handler ensures nothing is processed twice.
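For readers curious what dedup logic like this can look like, here is a minimal sketch, not the handler's actual code. It assumes each reply carries a platform name, a thread ID, and a message ID (hypothetical field names), and it keeps the set of processed keys in memory, where a real system would persist it somewhere that survives a restart.

```python
import hashlib


def reply_key(platform: str, thread_id: str, message_id: str) -> str:
    """Build a stable fingerprint for one reply on one platform."""
    raw = f"{platform}:{thread_id}:{message_id}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def new_replies(replies: list[dict], seen: set[str]) -> list[dict]:
    """Return only replies not processed before, and record them as seen."""
    fresh = []
    for reply in replies:
        key = reply_key(reply["platform"], reply["thread_id"], reply["message_id"])
        if key in seen:
            continue  # already handled in an earlier run
        seen.add(key)
        fresh.append(reply)
    return fresh
```

With a check like this, a run after an outage can safely pull everything still unread. Replies handled before the outage are skipped, and the rest are processed once.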
What is the actual cost to run this? The Hostinger VPS is under $30 per month. Claude API costs (Haiku for classification, Sonnet for drafts) are under $20 per month at current volume. Apify costs around $29 per month. Everything else is either free (SearXNG) or covered under existing tool subscriptions. Those three line items add up to roughly $80 per month at most, so the infrastructure cost to run the agent layer is well under $100 per month.
An agent that handles 90% autonomously and routes the other 10% with full context is more valuable than one trying to handle 100% silently.
If you want to see this in action, book a call to see how we run client campaigns.
