Advanced multi-agent coordination for managing AI agent pods and human teams. Includes agent architectures, communication patterns (pub/sub, message queues), task distribution, consensus mechanisms, conflict resolution, agent specialization, collaborative problem-solving, shared state management, lifecycle management, and multi-agent observability. Supports LangGraph, AutoGen, CrewAI, and distributed systems patterns.
Installation
Details
Usage
After installing, this skill will be available to your AI coding assistant.
Verify installation:
npx agent-skills-cli list
Skill Instructions
name: multi-agent-coordination-framework
description: Advanced multi-agent coordination for managing AI agent pods and human teams. Includes agent architectures, communication patterns (pub/sub, message queues), task distribution, consensus mechanisms, conflict resolution, agent specialization, collaborative problem-solving, shared state management, lifecycle management, and multi-agent observability. Supports LangGraph, AutoGen, CrewAI, and distributed systems patterns.
allowed-tools: [Read, Write, Edit, Bash, Glob, Grep, WebFetch]
Multi-Agent Coordination Framework
Purpose
Managing multiple AI agents or human teams requires sophisticated coordination mechanisms. This Skill provides comprehensive capabilities for:
- Multi-Agent System Architectures - Hub-spoke, peer-to-peer, hierarchical coordination
- Agent Communication Patterns - Pub/sub, message queues, direct messaging, broadcast
- Task Distribution Algorithms - Load balancing, capability-based routing, priority queues
- Consensus and Voting Mechanisms - Agreement protocols, quorum-based decisions
- Conflict Resolution - Handling disagreements, resource contention, priority conflicts
- Agent Specialization and Routing - Role-based agents, skill matching, dynamic routing
- Collaborative Problem-Solving - Multi-agent reasoning, distributed search, collective intelligence
- Shared State Management - Distributed state, CRDTs, event sourcing
- Agent Lifecycle Management - Registration, health checks, scaling, retirement
- Multi-Agent Debugging and Observability - Distributed tracing, agent metrics, visualization
When to Use This Skill
Use this skill when you need to:
- Build multi-agent AI systems with specialized agents
- Coordinate human teams with AI assistance
- Implement distributed problem-solving requiring multiple perspectives
- Design complex workflows requiring agent collaboration
- Create agent swarms for parallel processing
- Implement human-in-the-loop AI systems
- Orchestrate multi-model systems (GPT-4, Claude, local models)
- Build competitive agent systems (agents voting/competing)
- Design hierarchical agent organizations
- Create agent mesh networks for resilience
- Implement collaborative code review by multiple agents
- Build ensemble AI systems for improved accuracy
Quick Start
1. Choose Your Architecture
Start by selecting the right architecture for your use case:
- Hub-Spoke (Centralized) - Simple coordinator routes tasks to specialized agents
  - Use when: Single point of coordination is acceptable, simple debugging needed
  - Example: Supervisor agent coordinating code review by specialized agents
- Peer-to-Peer (Distributed) - Agents communicate directly without central coordinator
  - Use when: High availability needed, no single point of failure tolerated
  - Example: Agent mesh for distributed data processing
- Hierarchical (Tree) - Multi-level supervision with delegation
  - Use when: Complex workflows, need clear responsibility hierarchy
  - Example: Engineering organization simulation with managers and workers
- Mesh (Fully Connected) - All agents can communicate with all others
  - Use when: Maximum resilience required, communication overhead acceptable
  - Example: Consensus-based decision systems
See REFERENCE.md for detailed architecture diagrams.
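As a rough illustration of the hub-spoke option, the sketch below shows a coordinator that sends each task to the least-loaded agent advertising the required capability. The `Coordinator` and `Agent` classes, the agent names, and the capability labels are hypothetical and are not taken from any framework named in this skill.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    capabilities: set
    active_tasks: int = 0  # crude load measure used for balancing


@dataclass
class Coordinator:
    """Hypothetical hub: the only component that knows about every agent."""
    agents: list = field(default_factory=list)

    def register(self, agent: Agent) -> None:
        self.agents.append(agent)

    def route(self, capability: str) -> Agent:
        # Capability-based routing: keep only agents that can do the work.
        candidates = [a for a in self.agents if capability in a.capabilities]
        if not candidates:
            raise LookupError(f"no agent offers capability {capability!r}")
        # Load balancing: pick the candidate with the fewest active tasks.
        chosen = min(candidates, key=lambda a: a.active_tasks)
        chosen.active_tasks += 1
        return chosen


hub = Coordinator()
hub.register(Agent("security_reviewer", {"security"}))
hub.register(Agent("style_reviewer_1", {"style"}))
hub.register(Agent("style_reviewer_2", {"style"}))

assignments = [hub.route(c).name for c in ["style", "style", "security", "style"]]
print(assignments)
```

Because every decision passes through `route`, the hub is easy to log and debug, which is the trade-off the hub-spoke entry describes: simplicity in exchange for a single point of coordination.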
2. Select Your Framework
Choose the framework that matches your needs:
| Framework | Best For | Complexity | Key Strength |
|---|---|---|---|
| LangGraph | Complex workflows, state management | Medium | Graph-based coordination |
| AutoGen | Conversations, human-in-loop | Low | Easy multi-agent chat |
| CrewAI | Role-based teams | Low | Task delegation |
| Custom | Full control, unique requirements | High | Maximum flexibility |
See KNOWLEDGE.md for detailed comparison.
3. Implement Your First Multi-Agent System
Example: Simple supervisor pattern with LangGraph
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator


class AgentState(TypedDict):
    # operator.add makes each node's messages append to the list instead of replacing it
    messages: Annotated[list, operator.add]
    next_agent: str


# Create specialized agents
def researcher(state):
    # Research logic
    return {"messages": ["Research complete"], "next_agent": "writer"}


def writer(state):
    # Writing logic
    return {"messages": ["Report written"], "next_agent": "FINISH"}


# Build graph
workflow = StateGraph(AgentState)
workflow.add_node("researcher", researcher)
workflow.add_node("writer", writer)
# Fixed edges: researcher always hands off to writer; next_agent is informational here
workflow.add_edge("researcher", "writer")
workflow.add_edge("writer", END)
workflow.set_entry_point("researcher")

app = workflow.compile()
result = app.invoke({"messages": [], "next_agent": "researcher"})
See EXAMPLES.md for complete working examples.
4. Add Consensus Mechanism (Optional)
For critical decisions, implement voting:
from multi_agent_coordination import ConsensusEngine, Vote, VoteType

consensus = ConsensusEngine(agents=["agent_1", "agent_2", "agent_3"])
votes = [
    Vote("agent_1", VoteType.YES, 0.9, "High confidence"),
    Vote("agent_2", VoteType.YES, 0.8, "Agree"),
    Vote("agent_3", VoteType.NO, 0.7, "Concerns exist"),
]
result = consensus.simple_majority(votes)
# Returns: {"result": "PASS", "yes": 2, "no": 1, "percentage": 66.7}
See PATTERNS.md for full consensus patterns.
Implementation Patterns
This skill provides 6 battle-tested patterns:
Pattern 1: LangGraph Multi-Agent with Supervisor
When to use: Complex workflows with state persistence and conditional routing
Complexity: Medium
Key features: Graph-based coordination, state management, cycle prevention
View Pattern Details | View Code Example
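The conditional routing and cycle prevention listed for Pattern 1 can be illustrated without any framework. The sketch below is plain Python, not LangGraph: `run_supervised`, `route`, and the two agent functions are hypothetical names, and the step limit is just one simple way to stop a routing loop.

```python
from typing import Callable, Dict

State = Dict[str, str]
AgentFn = Callable[[State], State]


def researcher(state: State) -> State:
    return {**state, "research": "notes"}


def writer(state: State) -> State:
    return {**state, "report": f"report based on {state['research']}"}


def route(state: State) -> str:
    """Supervisor decision: inspect the state and name the next agent."""
    if "research" not in state:
        return "researcher"
    if "report" not in state:
        return "writer"
    return "FINISH"


def run_supervised(
    state: State,
    router: Callable[[State], str],
    agents: Dict[str, AgentFn],
    max_steps: int = 10,
) -> State:
    # The step limit guards against a router that never returns "FINISH".
    for _ in range(max_steps):
        next_agent = router(state)
        if next_agent == "FINISH":
            return state
        state = agents[next_agent](state)
    raise RuntimeError("step limit reached; possible routing cycle")


AGENTS = {"researcher": researcher, "writer": writer}
final_state = run_supervised({}, route, AGENTS)
print(final_state["report"])
```

In a graph framework the router would typically become a conditional edge and the step limit a recursion or iteration cap; the control flow is the same idea.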
Pattern 2: AutoGen Multi-Agent Conversation
When to use: Conversational agents, human-in-the-loop, group chat scenarios
Complexity: Low
Key features: Natural conversation flow, easy human interaction, code execution
View Pattern Details | View Code Example
Pattern 3: CrewAI Role-Based Teams
When to use: Clear role assignments, task delegation, sequential workflows
Complexity: Low
Key features: Role specialization, task dependencies, built-in tools
View Pattern Details | View Code Example
Pattern 4: Consensus and Voting Mechanisms
When to use: Critical decisions, multiple agent perspectives, conflict resolution
Complexity: Medium
Key features: Multiple voting types, weighted decisions, quorum support
View Pattern Details | View Code Example
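To make "weighted decisions" and "quorum support" concrete, here is a minimal sketch of a weighted vote with a quorum check. It does not reproduce the skill's `ConsensusEngine`; `Ballot`, `weighted_decision`, and the default thresholds are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Ballot:
    agent: str
    approve: bool
    weight: float = 1.0  # assumed meaning: confidence or seniority of the voter


def weighted_decision(
    ballots: Sequence[Ballot],
    electorate_size: int,
    quorum: float = 0.5,
    threshold: float = 0.5,
) -> str:
    """Return 'PASS', 'FAIL', or 'NO_QUORUM' for a weighted vote."""
    if electorate_size <= 0:
        raise ValueError("electorate_size must be positive")
    # Quorum: enough of the electorate must have voted at all.
    if len(ballots) / electorate_size < quorum:
        return "NO_QUORUM"
    total_weight = sum(b.weight for b in ballots)
    if total_weight == 0:
        return "FAIL"
    yes_weight = sum(b.weight for b in ballots if b.approve)
    # Pass only if the approving share of the weight exceeds the threshold.
    return "PASS" if yes_weight / total_weight > threshold else "FAIL"


ballots = [
    Ballot("agent_1", True, 0.9),
    Ballot("agent_2", True, 0.8),
    Ballot("agent_3", False, 0.7),
]
full_vote = weighted_decision(ballots, electorate_size=3)
partial_vote = weighted_decision(ballots[:1], electorate_size=3)
print(full_vote, partial_vote)  # PASS NO_QUORUM
```

Separating the quorum check from the threshold check keeps "not enough voters" distinguishable from "voters disagreed", which usually call for different follow-up actions.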
Pattern 5: Shared State with Event Sourcing
When to use: Distributed state, audit trail needed, state replay required
Complexity: High
Key features: Immutable events, state reconstruction, time-travel debugging
View Pattern Details | View Code Example
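The three key features of Pattern 5 fit in a few lines of plain Python. The sketch below is an in-memory illustration under assumed names (`Event`, `EventStore`, `apply_event`, and the `task_*` event kinds are hypothetical); a real deployment would persist the log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Immutable record of something that happened."""
    seq: int
    kind: str
    payload: dict


def apply_event(state: dict, event: Event) -> None:
    # State is only ever changed by applying events in order.
    tasks = state["tasks"]
    task_id = event.payload["id"]
    if event.kind == "task_created":
        tasks[task_id] = {"status": "open", "owner": None}
    elif event.kind == "task_assigned":
        tasks[task_id].update(status="assigned", owner=event.payload["agent"])
    elif event.kind == "task_completed":
        tasks[task_id]["status"] = "done"


class EventStore:
    def __init__(self) -> None:
        self._events = []  # append-only log

    def append(self, kind: str, payload: dict) -> Event:
        event = Event(len(self._events) + 1, kind, dict(payload))
        self._events.append(event)
        return event

    def replay(self, up_to: Optional[int] = None) -> dict:
        """Rebuild state from scratch, optionally stopping at sequence number up_to."""
        state = {"tasks": {}}
        for event in self._events:
            if up_to is not None and event.seq > up_to:
                break
            apply_event(state, event)
        return state


store = EventStore()
store.append("task_created", {"id": "t1"})
store.append("task_assigned", {"id": "t1", "agent": "researcher"})
store.append("task_completed", {"id": "t1"})

current = store.replay()
earlier = store.replay(up_to=2)  # "time travel" to just after the assignment
print(current["tasks"]["t1"], earlier["tasks"]["t1"])
```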
Pattern 6: Agent Lifecycle Management
When to use: Dynamic agent pools, health monitoring, auto-scaling needed
Complexity: High
Key features: Health checks, registration, auto-scaling, metrics
View Pattern Details | View Code Example
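Registration and heartbeat-based health checks can be sketched with a small in-memory registry. `AgentRegistry` and its methods are hypothetical names, the 30-second timeout is an arbitrary assumption, and auto-scaling and metrics are left out. The clock is injectable so the demo is deterministic.

```python
import time


class AgentRegistry:
    def __init__(self, heartbeat_timeout=30.0, clock=time.monotonic):
        self._timeout = heartbeat_timeout
        self._clock = clock  # injectable so the demo below does not need to sleep
        self._last_seen = {}

    def register(self, agent_id):
        self._last_seen[agent_id] = self._clock()

    def heartbeat(self, agent_id):
        if agent_id not in self._last_seen:
            raise KeyError(f"unknown agent: {agent_id}")
        self._last_seen[agent_id] = self._clock()

    def retire(self, agent_id):
        self._last_seen.pop(agent_id, None)

    def healthy(self):
        # An agent is healthy if its last heartbeat is within the timeout.
        now = self._clock()
        return sorted(
            agent for agent, seen in self._last_seen.items()
            if now - seen <= self._timeout
        )

    def evict_stale(self):
        # Remove agents that missed their heartbeat window and report them.
        alive = set(self.healthy())
        stale = sorted(agent for agent in self._last_seen if agent not in alive)
        for agent in stale:
            del self._last_seen[agent]
        return stale


now = [0.0]  # fake clock for the demo
registry = AgentRegistry(heartbeat_timeout=30.0, clock=lambda: now[0])
registry.register("researcher")
registry.register("writer")

now[0] = 20.0
registry.heartbeat("researcher")  # writer stays silent

now[0] = 40.0
healthy_now = registry.healthy()
evicted = registry.evict_stale()
print(healthy_now, evicted)  # ['researcher'] ['writer']
```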
Top Gotchas
1. Coordination Overhead
Problem: Too much agent communication slows everything down
Solution: Batch communications, use async patterns, minimize chatter
Detection: Monitor message count and latency between agents
2. Agent Deadlock
Problem: Agents waiting for each other in circular dependency
Solution: Timeout on all waits, detect cycles, use coordinator to break deadlocks
Detection: Trace agent state transitions, look for circular waits
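Looking for circular waits amounts to finding a cycle in a wait-for graph. The sketch below assumes the coordinator can collect a mapping from each agent to the agents it is currently blocked on; `find_wait_cycle` and the agent names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def find_wait_cycle(waits_for: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one circular wait as a list of agents, or None if there is none."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    status: Dict[str, int] = {}
    path: List[str] = []

    def visit(agent: str) -> Optional[List[str]]:
        status[agent] = IN_PROGRESS
        path.append(agent)
        for other in sorted(waits_for.get(agent, ())):
            other_status = status.get(other, UNVISITED)
            if other_status == IN_PROGRESS:
                # Reached an agent already on the current path: a cycle.
                return path[path.index(other):] + [other]
            if other_status == UNVISITED:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        status[agent] = DONE
        return None

    for agent in sorted(waits_for):
        if status.get(agent, UNVISITED) == UNVISITED:
            cycle = visit(agent)
            if cycle:
                return cycle
    return None


blocked = {
    "planner": {"coder"},
    "coder": {"reviewer"},
    "reviewer": {"planner"},
    "logger": set(),
}
cycle = find_wait_cycle(blocked)
print(cycle)  # ['coder', 'reviewer', 'planner', 'coder']
```

Once a cycle is reported, a coordinator can break it by cancelling or timing out any one wait on the returned path.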
3. State Inconsistency
Problem: Agents have different views of shared state
Solution: Event sourcing, CRDTs, eventual consistency, versioning
Detection: Compare agent state snapshots, look for divergence
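As one example of the CRDT option, a grow-only counter (G-Counter) lets each agent increment its own slot and merge replicas by taking per-agent maxima, so replicas converge regardless of merge order. The `GCounter` class below is a minimal sketch, not an API from this skill.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per node, merged by element-wise max."""

    def __init__(self, node_id: str) -> None:
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        # Each node only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Max is commutative, associative, and idempotent, so merges can
        # arrive in any order, or more than once, without double counting.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a = GCounter("agent_a")
b = GCounter("agent_b")
a.increment(3)  # agent_a processed 3 tasks
b.increment(2)  # agent_b processed 2 tasks

a.merge(b)
b.merge(a)
a.merge(b)  # repeated merge changes nothing

print(a.value(), b.value())  # 5 5
```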
View All 10 Gotchas - Detailed troubleshooting guide
Communication Patterns
Synchronous (Request-Response)
Direct agent-to-agent communication with immediate response.
Agent A ──request──► Agent B
Agent A ◄─response── Agent B
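A request-response call blocks the caller until the reply arrives, so it is usually paired with a timeout. The sketch below uses `asyncio.wait_for` from the standard library; `slow_agent` and `call_with_timeout` are hypothetical stand-ins for a real agent call.

```python
import asyncio


async def slow_agent(request: str, delay: float) -> str:
    # Stand-in for Agent B doing some work before replying.
    await asyncio.sleep(delay)
    return f"handled {request}"


async def call_with_timeout(agent_call, timeout: float):
    """Await the reply, or return None if Agent B takes too long."""
    try:
        return await asyncio.wait_for(agent_call, timeout=timeout)
    except asyncio.TimeoutError:
        return None


async def main():
    fast = await call_with_timeout(slow_agent("review PR", 0.01), timeout=0.5)
    slow = await call_with_timeout(slow_agent("review PR", 0.5), timeout=0.05)
    return fast, slow


fast_reply, slow_reply = asyncio.run(main())
print(fast_reply, slow_reply)  # handled review PR None
```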
Asynchronous (Message Queue)
Fire-and-forget messaging via queue.
Agent A ──msg──► Queue ──msg──► Agent B
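The queue decouples sender and receiver: Agent A continues as soon as the message is enqueued. A minimal in-process sketch with `asyncio.Queue` follows; the `producer` and `consumer` functions and the `None` sentinel are illustrative choices, and a production system would normally use an external broker.

```python
import asyncio


async def producer(queue: asyncio.Queue, tasks) -> None:
    for task in tasks:
        await queue.put(task)  # returns once the message is enqueued
    await queue.put(None)  # sentinel: no more messages


async def consumer(queue: asyncio.Queue, handled: list) -> None:
    while True:
        message = await queue.get()
        if message is None:
            break
        handled.append(message.upper())  # stand-in for real processing


async def main() -> list:
    queue = asyncio.Queue(maxsize=10)  # bounded queue gives backpressure
    handled = []
    await asyncio.gather(
        producer(queue, ["summarize", "translate"]),
        consumer(queue, handled),
    )
    return handled


handled = asyncio.run(main())
print(handled)  # ['SUMMARIZE', 'TRANSLATE']
```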
Pub/Sub (Broadcast)
One-to-many broadcasting to subscribers.
Publisher ──event──► Topic ──┬──► Subscriber 1
                             ├──► Subscriber 2
                             └──► Subscriber 3
See REFERENCE.md for detailed patterns.
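The pub/sub diagram maps onto a small in-process broker. The `Broker` class and the topic name below are hypothetical; the sketch delivers synchronously and omits persistence, retries, and error isolation between subscribers.

```python
from collections import defaultdict
from typing import Callable


class Broker:
    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver the event to every subscriber of the topic; return the count."""
        handlers = list(self._subscribers.get(topic, ()))
        for handler in handlers:
            handler(event)
        return len(handlers)


broker = Broker()
inboxes = {"subscriber_1": [], "subscriber_2": [], "subscriber_3": []}
for inbox in inboxes.values():
    # list.append serves as the handler: each subscriber stores what it receives
    broker.subscribe("task.completed", inbox.append)

delivered = broker.publish("task.completed", {"task": "t1", "by": "writer"})
undelivered = broker.publish("task.failed", {"task": "t2"})
print(delivered, undelivered)  # 3 0
```

The publisher never references a subscriber directly, which is what allows agents to be added or removed without changing the sender.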
Best Practices
DO's
- Start Simple - Begin with a single agent, add multi-agent only when needed
- Clear Contracts - Define explicit communication protocols between agents
- Timeout Everything - All agent interactions should have timeouts
- Monitor Conversations - Log all agent-to-agent communications
- Use Voting - For critical decisions, use consensus mechanisms
- Specialize Agents - Each agent should have clear, focused responsibility
- Handle Failures - Expect agents to fail, implement graceful degradation
- Version Protocols - Use versioned message formats for compatibility
- Test in Isolation - Test each agent independently before integration
- Implement Observability - Trace multi-agent interactions for debugging
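The "Clear Contracts" and "Version Protocols" items can be combined in a versioned message envelope. The sketch below is one possible shape: the `Envelope` fields, the version numbers, and the `text` to `content` migration are all invented for illustration.

```python
import json
from dataclasses import asdict, dataclass

CURRENT_VERSION = 2
SUPPORTED_VERSIONS = {1, 2}


@dataclass(frozen=True)
class Envelope:
    version: int
    sender: str
    kind: str
    body: dict


def encode(envelope: Envelope) -> str:
    return json.dumps(asdict(envelope))


def decode(raw: str) -> Envelope:
    data = json.loads(raw)
    version = data.get("version")
    # Reject unknown versions explicitly instead of guessing at their layout.
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported message version: {version!r}")
    body = dict(data["body"])
    if version == 1:
        # Hypothetical migration: v1 named the field "text", v2 names it "content".
        body["content"] = body.pop("text", "")
    return Envelope(CURRENT_VERSION, data["sender"], data["kind"], body)


old_message = encode(Envelope(1, "researcher", "note", {"text": "found 3 sources"}))
upgraded = decode(old_message)
print(upgraded.version, upgraded.body)  # 2 {'content': 'found 3 sources'}
```

Upgrading at the decode boundary means the rest of each agent only ever handles the current version.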
DON'Ts
- Don't Create Agent Explosion - Resist the urge to create too many agents
- Don't Share Mutable State - Use message passing, not shared memory
- Don't Ignore Deadlocks - Test for circular dependencies
- Don't Skip Health Checks - Monitor agent health continuously
- Don't Hardcode Routing - Use dynamic agent discovery and routing
- Don't Trust All Agents - Validate agent responses, especially in open systems
- Don't Forget Cleanup - Properly shut down and clean up agent resources
- Don't Over-Engineer - Simple coordination often beats complex protocols
Documentation Structure
This skill uses progressive disclosure - start here and drill down as needed:
- KNOWLEDGE.md - Framework deep-dives, theory, research, protocols
- PATTERNS.md - Implementation pattern details, architecture guidance
- EXAMPLES.md - Complete, runnable code examples for all patterns
- GOTCHAS.md - All 10 common pitfalls with detailed solutions
- REFERENCE.md - Architecture diagrams, API reference, feature matrix
Production Deployment Checklist
Before deploying multi-agent systems:
- Define agent roles and responsibilities
- Design communication protocols (sync vs async)
- Implement agent registration and discovery
- Set up health monitoring and alerts
- Configure timeouts for all operations
- Implement consensus mechanisms for critical decisions
- Set up distributed tracing (trace ID propagation)
- Add circuit breakers for agent-to-agent calls
- Implement agent authentication/authorization
- Configure resource limits (CPU, memory, concurrency)
- Set up dead letter queues for failed messages
- Implement agent versioning and compatibility
- Add metrics and dashboards for agent performance
- Test failure scenarios (agent crashes, network partition)
- Document agent handoff protocols
- Implement graceful shutdown procedures
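For the circuit-breaker item in the checklist above, a minimal sketch follows. `CircuitBreaker`, `CircuitOpenError`, and the default thresholds are assumptions, not part of this skill; the clock is injectable so the demo runs without waiting.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def unreachable_agent():
    raise ConnectionError("agent unreachable")


now = [0.0]  # fake clock for the demo
breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0, clock=lambda: now[0])

outcomes = []
for _ in range(3):
    try:
        breaker.call(unreachable_agent)
    except CircuitOpenError:
        outcomes.append("skipped")
    except ConnectionError:
        outcomes.append("failed")

now[0] = 11.0  # cool-down has passed
recovered = breaker.call(lambda: "ok")
print(outcomes, recovered)  # ['failed', 'failed', 'skipped'] ok
```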
Related Skills
- orchestration-coordination-framework - General orchestration patterns
- evaluation-reporting-framework - Evaluating multi-agent performance
- mcp-integration-toolkit - Agent communication via MCP
- ai-evaluation-suite - Testing agent interactions
- architecture-evaluation-framework - System architecture assessment
Quick Reference
Key Concepts
- Agent: Autonomous entity that can perceive, decide, and act
- Coordinator: Agent that orchestrates other agents
- Consensus: Agreement mechanism among multiple agents
- State: Shared or distributed data accessed by agents
- Handoff: Transfer of control from one agent to another
Framework URLs
Common Tasks
- Create supervisor agent: See Example 1
- Implement voting: See Example 4
- Manage shared state: See Example 5
- Auto-scale agents: See Example 6
- Debug agent interactions: See GOTCHAS.md
