Hermes Loop

Operator demo checklist
13 stops from health probe to deployment env vars.

What this is
A practical 13-step path through every working capability. Status is computed live from your actual database and provider environment variables.
What to do
Open each link in a new tab. Items show DONE when the system can verify it; ACTION/READY when there's work to do; BLOCKED when something's broken.
What happens next
When everything reads DONE or READY, you can ship — no fake green.
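
The status rules above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the app's actual logic: the `Probe` shape, its field names, and the way ACTION is separated from READY are hypothetical. Only the four labels and the "DONE or READY means you can ship" rule come from the text.

```typescript
// Hypothetical types: the real checklist's data model is not shown on this page.
type StepStatus = "DONE" | "READY" | "ACTION" | "BLOCKED";

interface Probe {
  broken: boolean;        // assumed signal: something required is failing
  verified: boolean;      // the system could verify the step from live data
  needsOperator: boolean; // assumed: the operator must act before the step can be tested
}

// Map one probe result to a status label.
function classify(p: Probe): StepStatus {
  if (p.broken) return "BLOCKED";
  if (p.verified) return "DONE";
  return p.needsOperator ? "ACTION" : "READY";
}

// Ship rule from the text: every step reads DONE or READY.
function canShip(statuses: StepStatus[]): boolean {
  return statuses.every((s) => s === "DONE" || s === "READY");
}
```

Checking `broken` first means a failing step can never report DONE, which matches the "no fake green" intent.
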
  1. Hermes health

    ok=true, mode=hermes, latency under ~500ms

    Live at https://openrouter.ai/api/v1

    Troubleshoot: If unreachable, check VPN / outbound network. If the mode reads demo instead of hermes, the Hermes env vars are not set.

  2. Load demo workspace

    Three completed demo missions, one active schedule, sample memories.

    10 missions on file

    Troubleshoot: Click 'Load demo command center' on /settings or the dashboard. The load is idempotent: running it twice does not create duplicates.

  3. Run Browser QA against /demo-target

    Bug Hunter Crew, real Playwright crawl, mission reaches WAITING_APPROVAL.

    Requires `npm run dev` running and ALLOW_LOCAL_BROWSER_QA=true.

    Troubleshoot: Prod deployments can't audit localhost — point at any public URL via EVAL_BROWSER_TARGET or the form.

  4. Approve a deliverable

    Approval card on the mission page or in /approvals; status flips to APPROVED, audit row + Discord notification (if configured).

    5 approval(s) waiting

  5. Generate the receipt

    WorkflowReceipt with deterministic integrity hash, riskLevel, real cost via openrouter-live.

    4 receipt(s) on file

  6. Trust ledger updates

    Aggregate KPIs reflect the new mission. Browser QA + runtime panels pick up the run.

  7. Run the runtime terminal

    Codebase Debugger crew or sandbox call. Backend chip shows local or docker — never silent.

    7 runtime tool call(s) on file

  8. Python RPC sandbox

    Approval-gated python_rpc with backend metadata. Runs in Docker if available; otherwise it falls back to the host and the fallback is flagged.

    Docker not available — host fallback only.

  9. Provider configuration matrix

    Every provider (Hermes, search, vision, image, TTS, Discord, Slack, Email, runtime) shows READY / PARTIAL / NOT_CONFIGURED.

    13 ready · 0 partial · 1 not configured

  10. Inbound integrations

    POST creates an InboxItem and queues TRIAGE_INBOX. Recent inbound table shows the item.

    Try /integrations/discord, /integrations/slack, /integrations/email — each page has a copy-pasteable curl.

  11. CLI smoke

    `npm run foundry -- health` prints {mode,ok,latency}. `jobs list`/`receipts list`/`mission create` return JSON. Exit codes are honest.

    Troubleshoot: If health fails, env isn't loaded — make sure .env.local exists.

    Ready to test
  12. Deployment env var sanity

    Every secret is in .env.local locally and in the Vercel/Railway dashboard for prod. ALLOW_LOCAL_BROWSER_QA stays unset in prod. Prod uses MISSION_RUN_MODE=queued plus a worker service.

    Troubleshoot: See README → Deployment for the full Vercel + Railway recipe.

  13. Mission deliverables surfaced

    Each completed mission shows a deliverable (BUG_REPORT / DRAFT / TRADE_JOURNAL / REPORT) on its mission page.

    8 deliverable(s) on file
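
The two prod rules from step 12 (ALLOW_LOCAL_BROWSER_QA stays unset, MISSION_RUN_MODE is queued) could be checked with a small helper like the one below. This is a minimal sketch: the function name, the `isProd` flag, and the message wording are hypothetical, and the repository may already do this differently. Whether a worker service is actually running cannot be inferred from env vars, so the sketch does not try.

```typescript
// Sketch of the step-12 rules. `checkDeploymentEnv` and `isProd` are
// hypothetical names, not part of the project.
type Env = Record<string, string | undefined>;

function checkDeploymentEnv(env: Env, isProd: boolean): string[] {
  const problems: string[] = [];
  if (!isProd) return problems; // these rules only apply to prod

  // Treat an empty string the same as unset.
  const localQa = env.ALLOW_LOCAL_BROWSER_QA;
  if (localQa !== undefined && localQa !== "") {
    problems.push("ALLOW_LOCAL_BROWSER_QA should stay unset in prod");
  }

  if (env.MISSION_RUN_MODE !== "queued") {
    problems.push("MISSION_RUN_MODE should be queued in prod");
  }
  return problems;
}
```

In a Node process this might be called as `checkDeploymentEnv(process.env, process.env.NODE_ENV === "production")`; an empty array means both rules pass.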

CLI smoke commands
npm run foundry -- health
npm run foundry -- jobs list --limit 3
npm run foundry -- receipts list --limit 3
npm run foundry -- mission create \
  --crew codebase-debugger \
  --title "CLI smoke" \
  --objective "Inspect repository status with safe commands and summarize."
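
To script the health check instead of reading it by eye, the printed output can be validated against the step 1 expectations (ok=true, mode=hermes, latency under ~500ms). The sketch below assumes the `health` command prints a single JSON object with `mode`, `ok`, and `latency` keys and that latency is in milliseconds. The checklist shows only `{mode,ok,latency}`, so the exact format and both function names are assumptions.

```typescript
// Assumed output shape: one JSON object with mode, ok, and latency (ms).
interface Health {
  mode: string;
  ok: boolean;
  latency: number;
}

// Parse the command's stdout and reject anything missing the three fields.
function parseHealth(stdout: string): Health {
  const v: unknown = JSON.parse(stdout);
  if (typeof v !== "object" || v === null) {
    throw new Error("health output is not an object");
  }
  const o = v as Record<string, unknown>;
  if (typeof o.mode !== "string" || typeof o.ok !== "boolean" || typeof o.latency !== "number") {
    throw new Error("health output is missing mode, ok, or latency");
  }
  return { mode: o.mode, ok: o.ok, latency: o.latency };
}

// Compare against the step 1 expectations; an empty array means healthy.
function healthIssues(h: Health, maxLatencyMs = 500): string[] {
  const issues: string[] = [];
  if (!h.ok) issues.push("ok is false");
  if (h.mode !== "hermes") issues.push(`mode is ${h.mode}, expected hermes`);
  if (h.latency >= maxLatencyMs) {
    issues.push(`latency ${h.latency}ms is not under ${maxLatencyMs}ms`);
  }
  return issues;
}
```

The 500 ms threshold is a parameter because the checklist states it only approximately.
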
Eval suites
npm run evals:tools     # cheap, no missions; gating + backend metadata
npm run evals           # real Hermes mission suite (~$0.005, ~70s)
# with dev server up:
npm run evals -- --crew=bug-hunter