Hack #5: Kiro · AWS Kiro
23 Apr, 14:00
VoiceBridge is a desktop app that translates your voice in real time and outputs it through a virtual microphone, so any meeting app (Zoom, Meet, Teams, Discord) hears you speaking the other person's language in your own cloned voice.

How it works:
1. Your microphone captures your speech.
2. ElevenLabs Scribe v2 Realtime transcribes it in 150 ms.
3. An LLM translates the transcript token by token (300 ms).
4. ElevenLabs Multilingual v2 TTS speaks the translation in your cloned voice (75 ms).
5. The audio goes out through a virtual microphone, and the meeting app picks it up automatically.

Total latency: under 1.5 seconds end-to-end. 90+ languages. The other participants don't install anything.

ElevenLabs APIs used:
- Speech-to-Text (Scribe v2 Realtime): real-time WebSocket transcription with a manual commit strategy for push-to-talk
- Text-to-Speech (Multilingual v2): voice-cloned speech synthesis with speaker boost for consistent volume
- Voice Cloning (Instant Voice Clone): a 30-second recording creates a voice profile that persists across sessions

Key features:
- Push-to-talk with an animated listening indicator
- Voice clone management: create, switch, and delete multiple voice profiles
- Works on macOS, Windows, and Linux
- BYO keys: your API keys are AES-256 encrypted and stored locally, and they are never sent to any server except the API providers
- Nothing design system UI: OLED black, Space Mono, mechanical toggles
- Built with Kiro's spec-driven development (requirements → design → tasks)

Tech stack: Electron, Preact, TypeScript, ffmpeg, BlackHole (macOS virtual audio driver)

GitHub: github.com/AlleyBo55/VoiceBridge
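To see how the per-stage figures relate to the end-to-end claim, the sketch below adds up the three quoted model latencies and compares them with the 1.5-second budget. The `Stage` type and all identifiers are hypothetical and not taken from the VoiceBridge repository; only the millisecond values come from the description above.

```typescript
// Hypothetical stage list; the latency figures are the ones quoted in the write-up.
interface Stage {
  name: string;
  latencyMs: number;
}

const stages: Stage[] = [
  { name: "transcribe (Scribe v2 Realtime)", latencyMs: 150 },
  { name: "translate (LLM)", latencyMs: 300 },
  { name: "synthesize (Multilingual v2 TTS)", latencyMs: 75 },
];

// The stated end-to-end ceiling of 1.5 seconds.
const END_TO_END_BUDGET_MS = 1500;

// Sum the quoted latency of every stage in the list.
function totalLatencyMs(list: Stage[]): number {
  return list.reduce((sum, stage) => sum + stage.latencyMs, 0);
}

const modelTimeMs = totalLatencyMs(stages);
const headroomMs = END_TO_END_BUDGET_MS - modelTimeMs;

console.log(`model stages: ${modelTimeMs} ms, headroom: ${headroomMs} ms`);
```

The three quoted stages sum to 525 ms, which leaves 975 ms of the stated budget for everything the description does not itemize, such as microphone capture, network round trips, and audio routing through the virtual microphone.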

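The BYO-keys feature says API keys are AES-256 encrypted and stored locally. The description does not say which cipher mode or key derivation is used, so the following is only a minimal sketch under assumptions: AES-256-GCM from Node's built-in `crypto` module, with the key derived from a user passphrase via scrypt. The `EncryptedKey` shape and both function names are hypothetical and not taken from the VoiceBridge code.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical on-disk shape: every field is base64 so it can be stored as JSON.
interface EncryptedKey {
  salt: string;
  iv: string;
  tag: string;
  data: string;
}

// Encrypt an API key with AES-256-GCM (assumed mode; the source only says "AES-256").
function encryptApiKey(apiKey: string, passphrase: string): EncryptedKey {
  const salt = randomBytes(16);
  const key = scryptSync(passphrase, salt, 32); // 32 bytes = 256-bit key
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(apiKey, "utf8"), cipher.final()]);
  return {
    salt: salt.toString("base64"),
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    data: data.toString("base64"),
  };
}

// Decrypt; throws if the passphrase is wrong or the stored data was modified.
function decryptApiKey(stored: EncryptedKey, passphrase: string): string {
  const key = scryptSync(passphrase, Buffer.from(stored.salt, "base64"), 32);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(stored.iv, "base64"));
  decipher.setAuthTag(Buffer.from(stored.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(stored.data, "base64")),
    decipher.final(),
  ]);
  return plain.toString("utf8");
}
```

GCM is chosen here because it authenticates the ciphertext, so a corrupted or tampered key file fails to decrypt instead of yielding garbage. A real app might derive the key from an OS keychain secret rather than a passphrase.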