Operations Manual

Running an AI agent team day to day. The complete playbook for monitoring, debugging, incident response, memory management, security operations, and cost control — based on 72 hours of live operation at SSBAA.

Updated March 23, 2026 · ⏱ 35 min read · 🟢 All Levels · Live operation reference
YOSHI MORNING BRIEF — SSBAA Operations
06:00 · March 23, 2026
ZELDA — Weekly content calendar drafted (7 pieces, Mon–Sun)
FOX — Blog index page deployed · ssb-aa.com/blog live
CLAW-SHIELD — Security audit passed · all skills clean
Memory compaction — 13,739 → 7,630 chars ✓
FOX_FAILURE_COUNT = 1 · FOX on probationary status
/join page still missing Stripe button · FOX task queued

Daily operations

Running an AI agent team day to day takes less time than you'd expect — roughly 20–30 minutes a morning to review the brief, approve outputs, and set the day's task queue. The agents handle the rest.

The daily rhythm at SSBAA:

The goal: Your agents should be doing 80% of the operational work. If you're spending more than an hour a day managing them, something in the system isn't working. Fix the protocol, not the symptom.

The morning brief

YOSHI sends a Telegram message at 06:00 every morning. It covers everything that happened overnight in a structured format you can read in two minutes. The brief follows a fixed structure:

  1. Completed tasks — what ran successfully overnight with verification evidence
  2. Needs attention — anything flagged for RAY's review or decision
  3. Session metrics — uptime, API spend, memory usage
  4. Next 24 hours — what's scheduled for tonight
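The fixed four-part structure can be sketched as a small data type plus a renderer. This is only an illustration of the layout described above: `MorningBrief` and `render_brief` are hypothetical names, not part of OpenClaw, and the real brief YOSHI sends may be formatted differently.

```python
from dataclasses import dataclass, field


@dataclass
class MorningBrief:
    """Hypothetical container for the four sections of the brief."""
    completed: list[str] = field(default_factory=list)
    needs_attention: list[str] = field(default_factory=list)
    metrics: dict[str, str] = field(default_factory=dict)
    next_24h: list[str] = field(default_factory=list)


def render_brief(brief: MorningBrief) -> str:
    """Render all four sections in their fixed order, even when empty."""
    def bullets(items):
        return [f"  - {item}" for item in items] or ["  (none)"]

    lines = ["1. Completed tasks", *bullets(brief.completed)]
    lines += ["2. Needs attention", *bullets(brief.needs_attention)]
    lines += ["3. Session metrics",
              *bullets(f"{key}: {value}" for key, value in brief.metrics.items())]
    lines += ["4. Next 24 hours", *bullets(brief.next_24h)]
    return "\n".join(lines)
```

Printing every section, including empty ones, keeps the message shape identical from day to day, which is what makes a two-minute read possible.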

If the brief is missing, the Gateway has most likely gone down. Run openclaw gateway status before anything else.

Full cron schedule

All 14 scheduled jobs (times are Perth local time):

Time (Perth)   Agent         Task
Every 2h       YOSHI         Pre-order monitor → Telegram alert if new deposit
22:00          YOSHI         Nightly brief + spawn ZELDA and FOX sessions
22:30          ZELDA         Research cycle: competitor scan, news, content brief
23:00          FOX           Task queue: execute all queued build tasks
00:00          YOSHI         FOX accountability check: verify ping, increment failure count if silent
02:00          YOSHI         Memory compaction: prune MEMORY.md to under 8,000 chars
02:31          YOSHI         SESSION-STATE flush to disk
04:00          CLAW-SHIELD   Full security audit: all skills and plugins
06:00          YOSHI         Morning brief → Telegram (8688821567)
06:30          System        All session clear: agents rest until 22:00
07:00          YOSHI         Perth brief: weather, calendar summary
08:00          YOSHI         Daily morning brief: task queue status
09:00          ZELDA         Daily content: social media drafts for today
Mon 09:00      ZELDA         Weekly content calendar: full week planning
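To reason about what runs next at any given moment, the fixed-time rows of the schedule above can be held as plain data. This is a minimal sketch, not how OpenClaw stores its cron jobs: `SCHEDULE` and `next_job` are hypothetical names, and the "Every 2h" and Monday-only rows are left out to keep it short.

```python
from datetime import time

# Fixed-time jobs copied from the schedule (Perth local time).
SCHEDULE = [
    (time(22, 0), "YOSHI", "Nightly brief + spawn ZELDA and FOX sessions"),
    (time(22, 30), "ZELDA", "Research cycle"),
    (time(23, 0), "FOX", "Task queue"),
    (time(0, 0), "YOSHI", "FOX accountability check"),
    (time(2, 0), "YOSHI", "Memory compaction"),
    (time(2, 31), "YOSHI", "SESSION-STATE flush to disk"),
    (time(4, 0), "CLAW-SHIELD", "Full security audit"),
    (time(6, 0), "YOSHI", "Morning brief"),
    (time(6, 30), "System", "All session clear"),
    (time(7, 0), "YOSHI", "Perth brief"),
    (time(8, 0), "YOSHI", "Daily morning brief"),
    (time(9, 0), "ZELDA", "Daily content"),
]


def next_job(now: time):
    """Return the first job at or after `now`, wrapping past midnight."""
    ordered = sorted(SCHEDULE, key=lambda job: job[0])
    for job in ordered:
        if job[0] >= now:
            return job
    # Nothing left today, so the next job is the earliest one tomorrow.
    return ordered[0]
```

For example, `next_job(time(1, 0))` returns the 02:00 memory compaction row, and a time after 22:30 but before 23:00 returns the FOX task queue.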

Health checks

Run openclaw doctor any time something feels off. Here's what a healthy system looks like vs common warning signs:

Gateway: Healthy
  WebSocket running on port 18789. All agent sessions active. No reconnection events in last hour.
  ws://127.0.0.1:18789 · uptime 8h 14m

Memory: Healthy
  MEMORY.md within safe limit. Auto-compaction running. No hallucination events detected.
  7,630 / 8,000 chars · 95% full

FOX: Caution
  On probationary status. FOX_FAILURE_COUNT = 1. Last verified output 6 hours ago. Ping monitoring active.
  FOX_FAILURE_COUNT: 1 / 3

API spend: Normal
  Overnight spend within budget. Context token limit set to 50,000 per session. No runaway loops detected.
  $1.84 overnight · $22 / month est.
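The memory and FOX readings above reduce to two small calculations. The sketch below is illustrative only: the function names are hypothetical, the limits are taken from the figures in this manual (8,000 chars, a failure count out of 3), and what actually happens when the count reaches 3 is not covered here, so the code only reports it.

```python
MEMORY_LIMIT_CHARS = 8_000   # compaction target stated in this manual
FAILURE_LIMIT = 3            # from "FOX_FAILURE_COUNT: 1 / 3"


def memory_usage(chars: int) -> tuple[int, bool]:
    """Return (percent of limit used, whether the file is within the limit)."""
    percent = round(chars / MEMORY_LIMIT_CHARS * 100)
    return percent, chars <= MEMORY_LIMIT_CHARS


def record_ping(failure_count: int, responded: bool) -> int:
    """Increment the count when the agent was silent.

    Assumption: a successful ping leaves the count unchanged rather than
    resetting it. The manual does not say which behaviour is used.
    """
    return failure_count if responded else failure_count + 1


def agent_status(failure_count: int) -> str:
    """Map a failure count to a coarse status label (labels are illustrative)."""
    if failure_count >= FAILURE_LIMIT:
        return "limit reached"
    if failure_count > 0:
        return "probation"
    return "ok"
```

With the values shown in the cards, `memory_usage(7630)` gives 95 percent and within limit, and `agent_status(1)` gives probation.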

Memory management

MEMORY.md is the most operationally critical file in the system. Everything else can be rebuilt. If MEMORY.md gets corrupted or bloated, your agents start hallucinating and you lose continuity across sessions.

# Check memory size (wc -c counts bytes; use wc -m to count characters if the file contains non-ASCII text)
wc -c ~/.openclaw/workspace-yoshi/MEMORY.md
7,630 bytes ← safe (under 8,000)

# Manual compaction if over limit
openclaw memory compact --agent yoshi
✓ Compacted: 13,739 → 7,630 chars

# Check for hallucinated agent names
grep -i "false-agent\|hallucinated-agent\|unknown-agent" ~/.openclaw/workspace-yoshi/MEMORY.md
No matches ← clean
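The size check and the name scan can also be combined into one script, which is convenient if you want a single pass or fail result. This is a sketch under assumptions: `check_memory` is a hypothetical helper, the path and the three suspect names are copied from the commands above, and the 8,000 limit is counted in characters rather than bytes.

```python
from pathlib import Path

LIMIT_CHARS = 8_000
SUSPECT_NAMES = ("false-agent", "hallucinated-agent", "unknown-agent")


def check_memory(text: str) -> dict:
    """Report size and any suspect agent names found (case-insensitive)."""
    lowered = text.lower()
    return {
        "chars": len(text),
        # "Safe" in this manual means strictly under the limit.
        "needs_compaction": len(text) >= LIMIT_CHARS,
        "suspect_names": [name for name in SUSPECT_NAMES if name in lowered],
    }


if __name__ == "__main__":
    path = Path.home() / ".openclaw" / "workspace-yoshi" / "MEMORY.md"
    if path.exists():
        print(check_memory(path.read_text(encoding="utf-8")))
```

The script only reports. Compaction itself is still done with the CLI command shown above.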

Security operations

CLAW-SHIELD runs automatically at 04:00 every night. But there are manual checks you should run whenever you install a new skill or plugin:

# Manual security audit
openclaw shield audit --full
Scanning 6 installed skills...
✓ openclaw-core — clean
✓ yoshi-skill — clean
✓ fox-skill — clean
✓ zelda-skill — clean
✓ claw-shield — clean
✓ blog-builder — clean
All skills passed. No threats detected.

⚠️ Never install a skill from ClawHub without running a manual audit first. In March 2026, Cisco flagged 386 malicious skills on ClawHub with data exfiltration patterns. We found one on our own stack on Day 3. The mem0 plugin was scanning all environment variables and sending them to an external server. Treat every third-party plugin as untrusted until audited.
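As a first pass before a full audit, you can look for the specific pattern described above: code that both reads environment variables and talks to the network. The sketch below is a deliberately naive text heuristic, not how CLAW-SHIELD works. `flag_suspicious` and both regexes are assumptions made for illustration, and the check will miss obfuscated code and flag plenty of legitimate skills.

```python
import re

# Crude indicators only; both patterns are illustrative assumptions.
ENV_ACCESS = re.compile(r"os\.environ|os\.getenv|process\.env")
NETWORK_USE = re.compile(r"requests\.|urllib|http\.client|fetch\(|https?://")


def flag_suspicious(sources: dict[str, str]) -> list[str]:
    """Return names of files that both read env vars and use the network.

    `sources` maps a file name to its source text.
    """
    return sorted(
        name
        for name, code in sources.items()
        if ENV_ACCESS.search(code) and NETWORK_USE.search(code)
    )
```

A flagged file is a prompt to read the code, not proof of exfiltration, and a clean result is not a substitute for openclaw shield audit --full.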

Key rotation

Rotate all API keys after any security incident, and on a regular schedule (we do monthly). The keys to rotate:

After rotating, update each key in OpenClaw config and restart the Gateway:

openclaw config set openrouter_api_key sk-or-NEW-KEY
openclaw config set telegram_bot_token NEW-TOKEN
openclaw gateway restart
✓ Gateway restarted with updated credentials
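The rotation rule (immediately after an incident, otherwise monthly) is easy to track with a small helper. This is a sketch with assumed names: `keys_due` is hypothetical, "monthly" is approximated as 30 days, and the two key names are the ones used in the config commands above.

```python
from datetime import date, timedelta

ROTATION_INTERVAL = timedelta(days=30)  # "monthly", approximated as 30 days


def keys_due(last_rotated: dict[str, date], today: date,
             incident: bool = False) -> list[str]:
    """Return the keys that should be rotated now.

    After a security incident every key is due, regardless of age.
    """
    if incident:
        return sorted(last_rotated)
    return sorted(
        key for key, rotated_on in last_rotated.items()
        if today - rotated_on >= ROTATION_INTERVAL
    )
```

Keeping the last rotation date next to each key name means the morning review can answer "is anything overdue?" without guesswork.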

Incident response

When something goes wrong, follow this order:

  1. Identify — what exactly failed? Check YOSHI's last message, the gateway logs, and MEMORY.md.
  2. Contain — stop any runaway process. Run openclaw gateway stop if needed.
  3. Diagnose — was it a silent failure, a wrong answer, or a security issue? Each has a different fix.
  4. Fix — apply the specific fix for that failure mode (see Agent Team Blueprint for the full list).
  5. Verify — confirm the fix worked before restarting the session.
  6. Document — write what happened and what you changed to MEMORY.md. Future sessions will learn from it.
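For the final Document step, a consistent entry format keeps incident notes short, which matters given the 8,000 character limit on MEMORY.md. The sketch below is one possible shape, not a format OpenClaw requires: `Incident` and `memory_entry` are hypothetical, and the three failure modes are the ones named in the Diagnose step.

```python
from dataclasses import dataclass
from datetime import date

FAILURE_MODES = {"silent failure", "wrong answer", "security issue"}


@dataclass
class Incident:
    when: date
    what_failed: str    # step 1, Identify
    failure_mode: str   # step 3, Diagnose
    fix_applied: str    # step 4, Fix
    verified: bool      # step 5, Verify


def memory_entry(incident: Incident) -> str:
    """Format a one-line note, refusing to document an unverified fix."""
    if incident.failure_mode not in FAILURE_MODES:
        raise ValueError(f"unknown failure mode: {incident.failure_mode!r}")
    if not incident.verified:
        raise ValueError("verify the fix before documenting it as resolved")
    return (
        f"[{incident.when.isoformat()}] INCIDENT ({incident.failure_mode}): "
        f"{incident.what_failed}. Fix: {incident.fix_applied}."
    )
```

Refusing unverified entries enforces the ordering of the steps: nothing gets written to memory as resolved until the Verify step has passed.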

Debugging

# Gateway not responding
openclaw gateway status
✗ Gateway not running
openclaw gateway start --daemon
✓ Gateway started

# Agent not responding to pings
openclaw ping yoshi
✗ No response after 30s
openclaw session restart yoshi

# Config syntax error (happened Day 3)
openclaw config validate
✗ Syntax error at line 198
cp openclaw.json.bak openclaw.json
✓ Restored from backup
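The validate-then-restore step for the config file can be scripted so a broken edit never reaches the Gateway. This sketch assumes openclaw.json is plain JSON, which the file extension suggests but this manual does not confirm. `validate_or_restore` is a hypothetical helper and does not replace openclaw config validate, which may check more than syntax.

```python
import json
import shutil
from pathlib import Path


def validate_or_restore(config: Path, backup: Path) -> str:
    """Parse the config; on a syntax error, restore it from the backup."""
    try:
        json.loads(config.read_text(encoding="utf-8"))
        return "valid"
    except json.JSONDecodeError as err:
        # Make sure the backup itself parses before overwriting anything.
        # If it does not, the JSONDecodeError propagates to the caller.
        json.loads(backup.read_text(encoding="utf-8"))
        shutil.copyfile(backup, config)
        return f"restored (syntax error at line {err.lineno})"
```

Checking the backup first avoids replacing one broken file with another. Refresh the .bak copy after every change that validates cleanly so the fallback stays recent.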

Cost management

The token cost issue was real — we hit $60–70/day before fixing the context window settings. Here are the parameters that brought it down to $20–25/day:

Rule of thumb: If your daily API spend is more than $5 and you're not sure why, check for runaway session loops first. An agent stuck in a retry loop on a failing task burns tokens fast. Set maxRetries: 3 on all tasks.
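The effect of a retry cap is easiest to see in code. The sketch below shows the general technique, not OpenClaw's implementation of maxRetries: `run_with_retries` is a hypothetical wrapper, and it assumes the cap counts retries after the first attempt, so a cap of 3 allows at most 4 calls.

```python
def run_with_retries(task, max_retries: int = 3):
    """Call task() until it succeeds or the retry cap is exhausted.

    Returns (result, attempts). Re-raises the last error once the task
    has failed max_retries + 1 times, so a failing task cannot loop forever.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            if attempts > max_retries:
                raise
```

Without the cap, the loop would call a permanently failing task indefinitely, and every call costs tokens. With it, the worst case is bounded and the failure surfaces in the morning brief instead of the invoice.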

Want us to set all of this up for you?

This is exactly what SSBAA founding members get — the full agent stack, configured for your business, running overnight. You just read the morning brief.
