Period: 2025–2026
Client: GetWebLabs R&D
Engagement: 12+ months live
Investment: Self-funded
[Figure: Internal Ops Co-pilot, terminal showing the AI agent's tool calls]
// 01 / THE CHALLENGE

Thirty small jobs, every single week.

Running a small consultancy meant 30+ recurring ops tasks eating evening hours: backup audits, scraper health checks, log triage, GitHub repo sweeps, EA backtest summaries, Telegram triage. None of them complex on their own — collectively, they consumed 6–8 hours every week.

Hiring an ops person wasn't justified at this scale. The right answer was a multi-tool AI co-pilot — but the off-the-shelf AI products that existed in 2025 were either too narrow (single-purpose chatbots) or too dangerous (general agents with full shell access and no guardrails). So we built one.

// 02 / OUR APPROACH

Telegram front-end. Scoped shell back-end. Hard permission boundary.

Front-end: Telegram bot, because that's where the operator already lives all day. No new UI to learn, no new tab to keep open.

Planner: Anthropic Claude as the reasoning layer — chosen for tool-use reliability and the ability to refuse ambiguous instructions instead of guessing.

Tool layer: a tightly scoped allowlist of shell commands and Python scripts. Anything destructive (file delete, repo force-push, system service restart) requires explicit operator confirmation in chat.
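
A minimal sketch of how an allowlist with a confirmation gate could look. The command names, the destructive patterns, and the `classify` function are illustrative assumptions, not the production configuration:

```python
import shlex

# Hypothetical allowlist: only these executables may be run at all.
ALLOWLIST = {"df", "git", "systemctl", "rm", "python3"}

# Hypothetical destructive patterns: (executable, required tokens...).
# A command matching one of these needs operator confirmation in chat.
DESTRUCTIVE = [
    ("rm",),                        # any file delete
    ("git", "push", "--force"),     # repo force-push
    ("git", "push", "-f"),
    ("systemctl", "restart"),       # system service restart
]


def classify(command_line: str) -> str:
    """Return 'deny', 'confirm', or 'allow' for a proposed shell command."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in ALLOWLIST:
        return "deny"
    for pattern in DESTRUCTIVE:
        same_executable = argv[0] == pattern[0]
        has_all_tokens = all(tok in argv[1:] for tok in pattern[1:])
        if same_executable and has_all_tokens:
            return "confirm"
    return "allow"


if __name__ == "__main__":
    for cmd in ["git status", "git push --force origin main", "curl http://example.com"]:
        print(f"{cmd!r}: {classify(cmd)}")
```

In a design like this, the planner never decides what counts as destructive. A `confirm` result would pause the tool call until the operator replies in Telegram, and a `deny` result would never reach the shell.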

Permission model: two protected paths the agent can never touch, regardless of instruction — written into the system prompt AND enforced at the tool layer. Belt and braces.
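
The tool-layer half of that boundary can be sketched as a path check that runs before any command executes. This is an illustration under assumptions: the two roots below are placeholders (the real protected paths are not named here), and `is_protected` and `guard_paths` are hypothetical helper names:

```python
from pathlib import Path

# Placeholder roots for illustration only; the real protected paths are not public.
PROTECTED_ROOTS = [Path("/srv/protected-a"), Path("/srv/protected-b")]


def is_protected(candidate: str, roots=PROTECTED_ROOTS) -> bool:
    """True if the candidate path resolves to, or inside, a protected root."""
    # resolve() normalises '..' segments and follows symlinks, so
    # '/srv/other/../protected-a/x' cannot slip past a plain string comparison.
    resolved = Path(candidate).resolve()
    for root in roots:
        root_resolved = root.resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


def guard_paths(argv) -> None:
    """Raise PermissionError if any argument points into a protected root."""
    for arg in argv:
        if is_protected(arg):
            raise PermissionError(f"blocked: {arg} is inside a protected path")


if __name__ == "__main__":
    try:
        guard_paths(["cat", "/srv/protected-a/secrets.txt"])
    except PermissionError as exc:
        print(exc)
```

Because the check lives in code rather than in the prompt, it still holds if the model misreads or ignores its instructions. The system prompt is the first line of defence, and a check of this kind is the second.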

// 03 / WHAT THE AGENT ACTUALLY DOES

The work it took off our plate.

  • Recurring health checks. Every morning: status of services, recent error counts, disk space, last successful backup. One Telegram message, takes 8 seconds to read.
  • Log triage & summarisation. "Anything weird in last night's logs?" The agent reads, clusters by pattern, and summarises in plain English.
  • Repo sweeps. Uncommitted changes across 7 active repos, stale branches, last-pushed-N-days-ago. Surfaces what needs attention.
  • On-demand backtests. "Run the GBP triangular-arb test on last week's tick data." The agent fires the script, waits, and summarises the results into chat.
  • Inbound intent routing. Telegram messages from outside operators get classified (urgent, FYI, marketing spam) and routed accordingly.
  • Session backup automation. Every Claude session is auto-archived as JSONL plus a human-readable transcript, with retry on failure.
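
As one illustration of the log triage item above, "clusters by pattern" can be approximated by masking the parts of each line that vary and counting what is left. This is a sketch under assumptions, not the agent's actual method: the masking rules and the names `template` and `cluster` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical masking rules: replace the variable parts of a log line
# so that repeated events collapse into one template.
_MASKS = [
    (re.compile(r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}\S*"), "<TS>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+(\.\d+)*\b"), "<N>"),
]


def template(line: str) -> str:
    """Reduce a log line to its pattern by masking timestamps and numbers."""
    for pattern, token in _MASKS:
        line = pattern.sub(token, line)
    return line.strip()


def cluster(lines, top=5):
    """Return the most frequent line templates with their counts."""
    counts = Counter(template(line) for line in lines if line.strip())
    return counts.most_common(top)


if __name__ == "__main__":
    sample = [
        "2025-03-01 02:14:07 ERROR scraper 12 timed out after 30s",
        "2025-03-01 02:15:09 ERROR scraper 7 timed out after 30s",
        "2025-03-01 03:00:00 WARN disk usage at 91%",
    ]
    for pattern, count in cluster(sample):
        print(count, pattern)
```

A pre-pass like this would shrink a night of logs to a handful of templates with counts, which is a far smaller input for the model to summarise in plain English.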
// 04 / THE RESULTS

Twelve months. Still running.

12+ months in production
200+ tool calls / week
~6 hrs weekly ops time saved

Zero unauthorised actions in 12 months of operation. The permission boundary has caught the agent attempting to touch protected paths twice — both times it correctly stopped and asked for confirmation, exactly as designed.

"I built the first version as a weekend proof. A year later it's still the thing that runs my ops queue every morning. The architecture we now offer clients on the AI Co-pilot tier is the exact same one — battle-tested first on my own work."

— Founder, GetWebLabs
// 05 / STACK

What's under the hood.

  • Anthropic Claude
  • Telegram Bot API
  • Bash + Python tool layer
  • Allowlist permission model
  • systemd + cron orchestration
  • JSONL session archive
// READY?

Want a co-pilot for your own ops?

Same architecture, scoped to your business. We can map it in one discovery call.
