Case Study — Internal Ops Co-pilot

Period: 2025–2026

Client: GetWebLabs R&D

Engagement: 12+ months live

Investment: Self-funded

[Image: Internal Ops Co-pilot, terminal showing the AI agent's tool calls]

// 01 / THE CHALLENGE

Thirty small jobs, every single week.

Running a small consultancy meant 30+ recurring ops tasks eating evening hours: backup audits, scraper health checks, log triage, GitHub repo sweeps, EA backtest summaries, Telegram triage. None of them complex on their own — collectively, they consumed 6–8 hours every week.

Hiring an ops person wasn't justified at this scale. The right answer was a multi-tool AI co-pilot — but the off-the-shelf AI products that existed in 2025 were either too narrow (single-purpose chatbots) or too dangerous (general agents with full shell access and no guardrails). So we built one.

// 02 / OUR APPROACH

Telegram front-end. Scoped shell back-end. Hard permission boundary.

Front-end: Telegram bot, because that's where the operator already lives all day. No new UI to learn, no new tab to keep open.

Planner: Anthropic Claude as the reasoning layer — chosen for tool-use reliability and the ability to refuse ambiguous instructions instead of guessing.

Tool layer: a tightly scoped allowlist of shell commands and Python scripts. Anything destructive (file delete, repo force-push, system service restart) requires explicit operator confirmation in chat.
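A minimal sketch of how such a gate might look, assuming a tool layer that classifies each proposed command before running it. The executable names, the `classify` function, and the exact rules for what counts as destructive are illustrative assumptions, not the production allowlist.

```python
import shlex

# Hypothetical allowlist of executables the agent may invoke at all.
ALLOWED_EXECUTABLES = {"df", "git", "systemctl", "rm", "python3"}

# Characters that would let one command smuggle in another if a shell saw them.
SHELL_METACHARACTERS = set(";|&`$<>")

FORCE_FLAGS = {"--force", "-f", "--force-with-lease"}


def needs_confirmation(argv):
    """Assumed rules for 'destructive': deletes, force-pushes, service restarts."""
    exe, rest = argv[0], argv[1:]
    if exe == "rm":
        return True
    if exe == "systemctl" and "restart" in rest:
        return True
    if exe == "git" and "push" in rest and FORCE_FLAGS & set(rest):
        return True
    return False


def classify(command):
    """Return 'allow', 'confirm', or 'deny' for a proposed shell command."""
    if SHELL_METACHARACTERS & set(command):
        return "deny"
    try:
        argv = shlex.split(command)
    except ValueError:
        return "deny"  # unbalanced quotes and similar parse errors
    if not argv or argv[0] not in ALLOWED_EXECUTABLES:
        return "deny"
    return "confirm" if needs_confirmation(argv) else "allow"
```

In this sketch a `confirm` result would pause the agent until the operator replies in chat, and an allowed command would be run from the parsed argument list rather than through a shell, so the text that was checked is the text that executes.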

Permission model: two protected paths the agent can never touch, regardless of instruction — written into the system prompt AND enforced at the tool layer. Belt and braces.
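The tool-layer half of that check could be sketched as follows. The two directory names are placeholders (the case study does not name the real protected paths), and `touches_protected` is a hypothetical helper, not the shipped code.

```python
from pathlib import Path

# Placeholder roots; the real protected paths are not named in this case study.
PROTECTED = tuple(
    Path(p).resolve() for p in ("/srv/protected-a", "/srv/protected-b")
)


def touches_protected(candidate, cwd="/"):
    """True if candidate resolves to a protected root or anything beneath it."""
    # resolve() collapses ".." segments and follows symlinks that exist,
    # so relative tricks such as "../protected-b/x" are caught as well.
    resolved = (Path(cwd) / candidate).resolve()
    return any(resolved == root or root in resolved.parents for root in PROTECTED)
```

Comparing resolved path components rather than string prefixes means a sibling such as `/srv/protected-abc` is not mistaken for a protected directory. Because the check lives in code, it still applies if the model ignores or misreads the system prompt.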

// 03 / WHAT THE AGENT ACTUALLY DOES

The work it took off our plate.

✓ Recurring health checks. Every morning: status of services, recent error counts, disk space, last successful backup. One Telegram message, takes 8 seconds to read.
✓ Log triage & summarisation. "Anything weird in last night's logs?" The agent reads, clusters by pattern, summarises in plain English.
✓ Repo sweeps. Uncommitted changes across 7 active repos, stale branches, last-pushed-N-days-ago. Surfaces what needs attention.
✓ On-demand backtests. "Run the GBP triangular-arb test on last week's tick data." Agent fires the script, waits, summarises results into chat.
✓ Inbound intent routing. Telegram messages from outside operators get classified (urgent, FYI, marketing spam) and routed accordingly.
✓ Session backup automation. Every Claude session auto-archived as JSONL + human-readable transcript, with retry on failure.
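As an illustration of the last item, session archiving with retry might be sketched like this. The message shape (`role` and `content` keys), the file naming, and the linear backoff are all assumptions made for the example.

```python
import json
import time
from pathlib import Path


def archive_session(messages, out_dir, session_id, retries=3, delay=1.0):
    """Write <session_id>.jsonl and <session_id>.txt, retrying on OSError.

    Assumes each message is a dict with 'role' and 'content' keys.
    Returns True on success, False once all attempts have failed.
    """
    out = Path(out_dir)
    for attempt in range(1, retries + 1):
        try:
            out.mkdir(parents=True, exist_ok=True)
            # One JSON object per line: the machine-readable archive.
            jsonl = "".join(json.dumps(m, ensure_ascii=False) + "\n" for m in messages)
            # Plain-text rendering: the human-readable transcript.
            transcript = "".join(f"[{m['role']}] {m['content']}\n" for m in messages)
            (out / f"{session_id}.jsonl").write_text(jsonl, encoding="utf-8")
            (out / f"{session_id}.txt").write_text(transcript, encoding="utf-8")
            return True
        except OSError:
            if attempt == retries:
                return False
            time.sleep(delay * attempt)  # wait a little longer after each failure
    return False
```

A caller would typically alert the operator when this returns False, since a silently missing archive defeats the purpose of the backup.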

// 04 / THE RESULTS

Twelve months. Still running.

12+ months in production

200+ tool calls / week

~6 hrs weekly ops time saved

Zero unauthorised actions in 12 months of operation. The permission boundary has caught the agent attempting to touch protected paths twice — both times it correctly stopped and asked for confirmation, exactly as designed.

"I built the first version as a weekend proof. A year later it's still the thing that runs my ops queue every morning. The architecture we now offer clients on the AI Co-pilot tier is the exact same one — battle-tested first on my own work."

— Founder, GetWebLabs

// 05 / STACK

What's under the hood.

Anthropic Claude · Telegram Bot API · Bash + Python tool layer · Allowlist permission model · systemd + cron orchestration · JSONL session archive

Want a co-pilot for your own ops?