A governance framework for multi-agent AI systems, covering oversight, accountability, inter-agent trust, and risk management for systems where multiple AI agents collaborate or compete to complete tasks.
Use this repository when you need the oversight and accountability model for multi-agent systems.
Do not start here if you need control-flow design patterns. Use agent-orchestration.
Do not start here if you need evaluation metrics and benchmark scenarios. Use agent-eval.
Do not start here if you want runnable examples. Use agent-simulator.
Single-agent AI systems are complex. Multi-agent systems compound that complexity.
Before deploying a multi-agent system, classify each agent:
| Dimension | Categories |
|---|---|
| Role | Orchestrator / Executor / Validator / Monitor |
| Trust Level | Trusted / Semi-trusted / Untrusted |
| Autonomy Level | Supervised / Semi-autonomous / Fully autonomous |
| Risk Class | High / Medium / Low |
| Data Access | Read-only / Read-write / No access |
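The classification above could be recorded as a typed profile per agent. The following is a minimal sketch, not part of the framework itself: the names `AgentProfile` and `classification_warnings` are hypothetical, and the two consistency rules are illustrative assumptions rather than rules the framework prescribes.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ORCHESTRATOR = "orchestrator"
    EXECUTOR = "executor"
    VALIDATOR = "validator"
    MONITOR = "monitor"


class TrustLevel(Enum):
    TRUSTED = "trusted"
    SEMI_TRUSTED = "semi-trusted"
    UNTRUSTED = "untrusted"


class AutonomyLevel(Enum):
    SUPERVISED = "supervised"
    SEMI_AUTONOMOUS = "semi-autonomous"
    FULLY_AUTONOMOUS = "fully autonomous"


class RiskClass(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


class DataAccess(Enum):
    READ_ONLY = "read-only"
    READ_WRITE = "read-write"
    NO_ACCESS = "no access"


@dataclass(frozen=True)
class AgentProfile:
    """One row of the classification table, filled in for a single agent."""
    name: str
    role: Role
    trust: TrustLevel
    autonomy: AutonomyLevel
    risk: RiskClass
    data_access: DataAccess


def classification_warnings(profile: AgentProfile) -> list[str]:
    """Flag combinations worth a second look (assumed rules, for illustration only)."""
    warnings = []
    if (profile.trust is TrustLevel.UNTRUSTED
            and profile.data_access is DataAccess.READ_WRITE):
        warnings.append("untrusted agent has read-write data access")
    if (profile.risk is RiskClass.HIGH
            and profile.autonomy is AutonomyLevel.FULLY_AUTONOMOUS):
        warnings.append("high-risk agent is fully autonomous")
    return warnings
```

Keeping the profile immutable (`frozen=True`) means a reclassification has to create a new record, which makes changes easier to audit.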
┌─────────────────────────────────────────────────────────┐
│                      ORCHESTRATOR                       │
│             (high trust, human-supervised)              │
└──────────────────────┬──────────────────────────────────┘
                       │ delegates tasks + validates outputs
           ┌───────────┼───────────┐
           ▼           ▼           ▼
      ┌──────────┐ ┌──────────┐ ┌──────────┐
      │ AGENT A  │ │ AGENT B  │ │ AGENT C  │
      │ executor │ │ executor │ │ validator│
      │semi-trust│ │semi-trust│ │high-trust│
      └──────────┘ └──────────┘ └──────────┘
           │                       │
           ▼                       ▼
      ┌──────────┐              ┌──────────┐
      │EXTERNAL  │              │  HUMAN   │
      │  TOOL    │              │  REVIEW  │
      │untrusted │              │required  │
      │ sandbox  │              │for high  │
      └──────────┘              │  risk    │
                                └──────────┘
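The flow in the diagram (delegate, validate, then hold high-risk results for human review) might be sketched as follows. This is an illustrative sketch under stated assumptions: `Task`, `Outcome`, `orchestrate`, and the status strings are hypothetical names, and the executor and validator are plain callables standing in for real agents.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    description: str
    high_risk: bool = False


@dataclass
class Outcome:
    task: Task
    output: str
    status: str  # "accepted", "rejected", or "pending_human_review"


def orchestrate(
    task: Task,
    executor: Callable[[Task], str],
    validator: Callable[[Task, str], bool],
) -> Outcome:
    """Delegate a task, validate the output, and gate high-risk results."""
    # The orchestrator delegates the work to a semi-trusted executor.
    output = executor(task)
    # A separate validator checks the output before it is used.
    if not validator(task, output):
        return Outcome(task, output, "rejected")
    # High-risk tasks are never auto-accepted; they wait for a human.
    if task.high_risk:
        return Outcome(task, output, "pending_human_review")
    return Outcome(task, output, "accepted")
```

For example, `orchestrate(Task("refund customer", high_risk=True), executor, validator)` returns an outcome with status `"pending_human_review"` whenever the validator approves the output, so validation alone cannot release a high-risk result.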
| Gate | Timing | Requirements |
|---|---|---|
| System Design Review | Before development | agent roles, trust model, and failure modes documented |
| Individual Agent Validation | Before integration | each agent validated independently |
| Integration Testing | Before staging | end-to-end scenarios including failure injection |
| Red Team | Before production | adversarial scenarios such as prompt injection and agent hijacking |
| Human Oversight Check | Before production | human review points defined for all high-risk decision paths |
| Monitoring Activation | At deployment | per-agent and system-level monitoring configured |
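One way to enforce the gates is to check which ones are still open before a system moves to the next stage. The sketch below assumes a linear stage order and treats "At deployment" as the final stage; `STAGE_ORDER`, `GATES`, and `blocking_gates` are hypothetical names introduced for illustration.

```python
# Assumed linear lifecycle; the framework's table only gives the gate timings.
STAGE_ORDER = ["development", "integration", "staging", "production", "deployment"]

# (gate name, stage the gate must be passed for), taken from the table above.
GATES = [
    ("System Design Review", "development"),
    ("Individual Agent Validation", "integration"),
    ("Integration Testing", "staging"),
    ("Red Team", "production"),
    ("Human Oversight Check", "production"),
    ("Monitoring Activation", "deployment"),
]


def blocking_gates(target_stage: str, passed: set[str]) -> list[str]:
    """Return the gates that must still pass before reaching target_stage."""
    limit = STAGE_ORDER.index(target_stage)
    return [
        name
        for name, stage in GATES
        if STAGE_ORDER.index(stage) <= limit and name not in passed
    ]
```

An empty result means the stage can be entered. Earlier gates stay in scope for later stages, so skipping a gate early on keeps blocking every subsequent transition.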
For every multi-agent system, document:
| Metric | Frequency | Alerting Threshold |
|---|---|---|
| Task completion rate | Real-time | drop > 10% from baseline |
| Inter-agent latency | Real-time | P99 > configured SLA |
| Agent error rate | Real-time | error rate > 1% |
| Goal drift (planned vs. actual) | Per task | configurable by use case |
| Token usage / cost | Hourly | > 150% of daily budget |
| Human escalation rate | Daily | > 20% increase week-over-week |
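The fixed thresholds in the table can be evaluated with a few comparisons. This is a rough sketch, assuming the completion-rate drop and escalation increase are measured relative to their baselines; `Snapshot`, its field names, and `triggered_alerts` are hypothetical. Goal drift is omitted because its threshold is configurable per use case.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    completion_rate: float            # fraction of tasks completed, 0..1
    baseline_completion_rate: float   # expected fraction, 0..1
    p99_latency_ms: float
    latency_sla_ms: float
    error_rate: float                 # fraction of failed agent calls, 0..1
    spend: float                      # token cost so far today
    daily_budget: float
    escalation_rate: float            # this week's human escalation rate
    escalation_rate_last_week: float


def triggered_alerts(s: Snapshot) -> list[str]:
    """Return the names of the metrics whose alerting threshold is exceeded."""
    alerts = []
    # Relative drop of more than 10% from baseline (assumed interpretation).
    if s.baseline_completion_rate > 0:
        drop = (s.baseline_completion_rate - s.completion_rate) / s.baseline_completion_rate
        if drop > 0.10:
            alerts.append("task_completion_rate")
    if s.p99_latency_ms > s.latency_sla_ms:
        alerts.append("inter_agent_latency")
    if s.error_rate > 0.01:
        alerts.append("agent_error_rate")
    if s.spend > 1.5 * s.daily_budget:
        alerts.append("token_cost")
    # Relative week-over-week increase of more than 20%.
    if s.escalation_rate_last_week > 0:
        increase = (s.escalation_rate - s.escalation_rate_last_week) / s.escalation_rate_last_week
        if increase > 0.20:
            alerts.append("human_escalation_rate")
    return alerts
```

In practice these checks would run at the frequencies listed in the table (real-time, hourly, daily) rather than all at once; the single function here only shows the threshold logic.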
Full mapping: docs/nist-rmf-mapping.md
Key NIST AI RMF categories addressed:
| Repository | What it adds |
|---|---|
| agent-orchestration | routing, delegation, validation, and failure-handling patterns |
| agent-eval | evaluation dimensions, scenarios, and reporting structure |
| agent-simulator | runnable behavior and bounded workflow examples |
| governance-playbook | broader enterprise governance operating model |
| nist-rmf-guide | practitioner guide for RMF implementation |
Maintained by Sima Bagheri