AI
Your Agent Evals Are Lying to You
Most agent evals measure the clean path. Production readiness depends on the messy path: tools, time, retries, handoffs, stale state, trace evidence, and recovery.
Issue #16
If orchestration decides sequence, identity decides legitimacy: what an agent can do, for whom, under what authority, across which tenant boundary, and how operators recover when that authority breaks.
Issue #12
A2A turns agent-to-agent communication into a distributed-systems problem, with identity, task ownership, retries, trust, and failure handling now sitting on the critical path.