Hack #5: Kiro · AWS Kiro
23 Apr, 15:54
🎙️ VoiceGauntlet: Break your voice agent before the public does. 💥 Most teams ship voice agents after only testing the "happy path." Failures usually appear in production when users get angry, adversarial, or manipulative—resulting in broken policies, leaked data, or ignored escalations. 🚨 VoiceGauntlet is a spec-driven red-team harness that solves this. It turns a Kiro requirements.md spec into adversarial voice-agent attack scenarios, pressure-tests an ElevenLabs agent against those exact requirements, and sends the hardening tasks.md back into the same Kiro workflow. 🔄 Spec in. Attack out. Fix back. 🛠️ 💻 How we used Kiro (For Development & The Core Feature): First, we used Kiro for the entire development process of VoiceGauntlet. Every feature started as a Kiro spec (requirements → design → tasks) before a single line of code was written, using Kiro's AI agent to systematically implement our architecture. For the app's core functionality, Kiro isn’t just documentation; it is the source of truth. VoiceGauntlet uses a local MCP bridge to read your project’s actual requirements.md. It parses the acceptance criteria, turns them into ~20 adversarial test callers, and runs the attack. Once a failure is isolated, it generates a structured hardening task and writes it directly back to the Kiro spec folder as tasks.md. 📝 🗣️ How we used ElevenLabs (The Voice Substrate): We built the attack workflow entirely around the ElevenLabs voice-agent stack. VoiceGauntlet uses ElevenLabs Agents, the Simulate Conversation API, and specialized evaluation criteria for requirement-level checking. The underlying live-listen architecture is built around ElevenLabs signed URLs and WebSockets. (Note: For this hackathon demo, the calling stage utilizes a fast simulation mode to keep the visual attack loop tight and easily recordable). ⚡ 🎯 The end-to-end loop: 1️⃣ Product requirements are written in Kiro. 2️⃣ VoiceGauntlet reads that spec via MCP. 3️⃣ It generates hostile callers from the acceptance criteria. 4️⃣ It attacks the ElevenLabs voice agent. 5️⃣ It isolates the highest-risk failure and maps it to a specific requirement. 6️⃣ It generates the exact Kiro-friendly hardening markdown. 7️⃣ The task returns to Kiro as tasks.md. Don’t just test your agent. Pressure-test its requirements. 🛡️
