Hack #5: Kiro · AWS Kiro
23 Apr, 11:05
Vaidya: Ambient Clinical Scribe for Indian Doctors

Indian solo practitioners see 30-40 patients daily. Most write notes on paper or not at all; clinical documentation is the first casualty of a packed waiting room. Western ambient scribes don't handle Hindi-English code-switching, and enterprise pricing is out of reach for small clinics.

Vaidya listens to a doctor-patient conversation in Hindi/Hinglish, produces a structured English SOAP note (Subjective, Objective, Assessment, Plan), and generates a patient-friendly Hindi summary, all automatically.

**How it works:**

1. Doctor starts a visit → browser captures audio via MediaRecorder
2. Live transcript appears in real time during the conversation (Scribe v2 Realtime)
3. After "End Visit," the full audio is processed through Scribe v2 Batch with speaker diarization and 196 curated Indian medical keyterms (drug brands, Hindi symptom phrases, Ayurvedic terms)
4. The diarized transcript feeds Google Gemini to generate a structured SOAP note in English
5. A second Gemini call produces an 80-200 word patient-friendly Hindi summary
6. The doctor reviews, edits, and signs the note. The Hindi summary plays aloud via ElevenLabs TTS (Eleven v3)
7. On the patient detail page, a voice assistant (ElevenLabs Agents) lets the doctor ask questions about any patient's history by voice, in Hindi

**ElevenLabs integration (5 products):**

- **Scribe v2 Batch**: core transcription with speaker diarization (up to 32 speakers), Hindi/English code-switching, and keyterm prompting for medical vocabulary
- **Scribe v2 Realtime**: live transcript preview during recording via the `useScribe` hook from `@elevenlabs/react`
- **TTS Eleven v3**: Hindi patient summary narration with a warm, natural voice
- **Conversational AI Agents**: voice assistant on the patient detail page using the React SDK (`ConversationProvider` + `useConversationClientTool` for patient context)
- **ElevenLabs UI**: 5 components: LiveWaveform (recording visualization), AudioPlayer (visit playback), MicSelector (microphone selection), ShimmeringText (animated branding), Orb (agent speaking/listening state)

**Kiro IDE usage:**

- 2 specs with the full requirements → design → tasks workflow (ambient-scribe-pipeline: 15 requirements, 12 correctness properties; patient-voice-assistant: 4 requirements)
- 4 steering docs (externalized LLM prompts for SOAP generation and Hindi summary, medical writing conventions, consent/privacy guidelines)
- 2 hooks (typecheck-on-save, test-after-task)
- ElevenLabs Power for guided API integration
- 75 unit tests, plus property-based testing with fast-check

**Tech stack:** Next.js 15 (App Router), shadcn/ui, SQLite via Drizzle ORM, Google Gemini via Vercel AI SDK, TypeScript throughout.

**Pipeline performance:** 18.6 seconds end-to-end (about 3 s transcription + 2.4 s SOAP generation + 13.5 s Hindi summary) for a 10-minute visit recording.
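
Step 4 of the workflow hands a diarized transcript to the LLM. As a rough illustration only, the sketch below shows one way word-level diarization output could be collapsed into speaker turns and rendered as prompt text. The `DiarizedWord` shape, the `speaker_0`/`speaker_1` labels, and both function names are assumptions made for this example; the project's actual data model and the exact transcription response format are not described above.

```typescript
// Hypothetical shape of one diarized token. Field names are assumptions
// for illustration, not a confirmed API contract.
interface DiarizedWord {
  text: string;
  speakerId: string;
  start: number; // seconds from the start of the recording
  end: number; // seconds from the start of the recording
}

interface SpeakerTurn {
  speakerId: string;
  text: string;
  start: number;
  end: number;
}

// Merge consecutive words spoken by the same speaker into a single turn.
function groupIntoTurns(words: DiarizedWord[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const word of words) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speakerId === word.speakerId) {
      // Same speaker is still talking: extend the current turn.
      last.text = `${last.text} ${word.text}`;
      last.end = word.end;
    } else {
      // Speaker changed (or this is the first word): open a new turn.
      turns.push({
        speakerId: word.speakerId,
        text: word.text,
        start: word.start,
        end: word.end,
      });
    }
  }
  return turns;
}

// Render turns as "speaker: text" lines, one turn per line.
function renderTranscript(turns: SpeakerTurn[]): string {
  return turns.map((t) => `${t.speakerId}: ${t.text}`).join("\n");
}

// Made-up Hinglish exchange used only to exercise the functions.
const sampleWords: DiarizedWord[] = [
  { text: "Bukhar", speakerId: "speaker_0", start: 0.0, end: 0.4 },
  { text: "kab", speakerId: "speaker_0", start: 0.4, end: 0.6 },
  { text: "se", speakerId: "speaker_0", start: 0.6, end: 0.7 },
  { text: "hai?", speakerId: "speaker_0", start: 0.7, end: 0.9 },
  { text: "Teen", speakerId: "speaker_1", start: 1.2, end: 1.5 },
  { text: "din", speakerId: "speaker_1", start: 1.5, end: 1.7 },
  { text: "se.", speakerId: "speaker_1", start: 1.7, end: 1.9 },
];

const sampleTurns = groupIntoTurns(sampleWords);
const transcriptText = renderTranscript(sampleTurns);
console.log(transcriptText);
```

Keeping the start and end times on each turn would let a review screen jump the audio player to the matching moment, though whether Vaidya does this is not stated.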
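
Steps 4 and 5 depend on two output constraints: a SOAP note with four sections and a Hindi summary of 80-200 words. The sketch below shows one possible way to check both before a note reaches the doctor for review. It assumes the model returns the SOAP note as a JSON object with lowercase section keys, which the write-up does not confirm; `parseSoapNote`, `countWords`, and `isSummaryLengthOk` are hypothetical helpers, not the project's code.

```typescript
// Assumed JSON shape for a generated SOAP note (illustrative only).
interface SoapNote {
  subjective: string;
  objective: string;
  assessment: string;
  plan: string;
}

const SOAP_KEYS = ["subjective", "objective", "assessment", "plan"] as const;

// Parse raw model output; return null unless all four sections are
// present as non-empty strings.
function parseSoapNote(raw: string): SoapNote | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  for (const key of SOAP_KEYS) {
    const value = record[key];
    if (typeof value !== "string" || value.trim() === "") return null;
  }
  return {
    subjective: record.subjective as string,
    objective: record.objective as string,
    assessment: record.assessment as string,
    plan: record.plan as string,
  };
}

// Count whitespace-separated words; this works for Devanagari text
// because Hindi separates words with spaces.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Check the 80-200 word range stated for the patient summary.
function isSummaryLengthOk(summary: string, min = 80, max = 200): boolean {
  const n = countWords(summary);
  return n >= min && n <= max;
}

// Made-up model output used only to exercise the parser.
const exampleNote = parseSoapNote(
  JSON.stringify({
    subjective: "Fever for three days",
    objective: "Temperature 101 F",
    assessment: "Likely viral fever",
    plan: "Paracetamol and fluids; review in 3 days",
  }),
);
console.log(exampleNote !== null ? "note accepted" : "note rejected");
```

Checks like these also lend themselves to property-based tests, for example asserting that any string of 80 to 200 space-separated tokens passes `isSummaryLengthOk`.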
