Submission by Raj

Hack #4: turbopuffer · turbopuffer

13 Apr, 21:01

SoundDropLabs is an AI sound design tool that turns any text description into production-ready audio. It has two modes: SFX Mode generates 4 unique sound effect variations in parallel, and Scene Mode takes a scene description and outputs a full 4-layer DAW-style mix (ambience, foreground, background, music) in under 10 seconds.

The core is a 4-stage RAG pipeline. Every generation embeds the user's query via HuggingFace, runs semantic search across 26,264 indexed Freesound samples in turbopuffer (~20ms, cosine distance), feeds the 8 closest acoustic neighbors into Gemini 2.0 Flash for prompt enrichment, then calls the ElevenLabs SFX API 4 times in parallel.

The turbopuffer layer is what grounds the generations in real audio. Without it, ElevenLabs gets a vague prompt. With it, the model gets a vivid acoustic description built from real-world reference sounds.

Scene Mode runs 4 completely independent pipelines simultaneously via Promise.allSettled. The music layer uses the ElevenLabs Music API for a 30s instrumental; the other 3 layers use the SFX API. Progress streams live to the browser via SSE so users watch each stage complete in real time.

Full pipeline: ~5-6 seconds for SFX, ~8-10 seconds for a full scene.

Live demo: https://v0-soundroplabs.vercel.app
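The Scene Mode flow described above (four independent embed, search, enrich, generate pipelines joined with Promise.allSettled) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Stages` interface, `runLayer`, `runScene`, and the progress callback are hypothetical names, and the real HuggingFace, turbopuffer, Gemini, and ElevenLabs calls are left as injectable functions rather than guessed at. The `cosineDistance` helper only shows the metric the search is said to use; in the real system the vector store computes it.

```typescript
// Hypothetical sketch: none of these names come from the SoundDropLabs repo.
type Layer = "ambience" | "foreground" | "background" | "music";
type Stage = "embed" | "search" | "enrich" | "generate";

// Each stage is injected so the orchestration can run without real API clients.
interface Stages {
  embed(text: string): Promise<number[]>;
  search(vector: number[], topK: number): Promise<string[]>;
  enrich(prompt: string, neighbors: string[]): Promise<string>;
  generate(layer: Layer, prompt: string): Promise<string>; // audio URL or id
}

type Progress = (layer: Layer, stage: Stage) => void;

// Cosine distance = 1 - cosine similarity (0 for vectors pointing the same way).
// Assumes equal-length, non-zero vectors.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// One layer's pipeline; onProgress fires after each stage completes,
// which is the kind of event an SSE endpoint could forward to the browser.
async function runLayer(
  layer: Layer,
  scene: string,
  stages: Stages,
  onProgress: Progress,
): Promise<string> {
  const prompt = `${layer} layer for: ${scene}`;
  const vector = await stages.embed(prompt);
  onProgress(layer, "embed");
  const neighbors = await stages.search(vector, 8); // 8 closest samples
  onProgress(layer, "search");
  const enriched = await stages.enrich(prompt, neighbors);
  onProgress(layer, "enrich");
  const audio = await stages.generate(layer, enriched);
  onProgress(layer, "generate");
  return audio;
}

// Runs all four layers concurrently; allSettled keeps the layers that
// succeeded even if one of them rejects.
async function runScene(
  scene: string,
  stages: Stages,
  onProgress: Progress = () => {},
) {
  const layers: Layer[] = ["ambience", "foreground", "background", "music"];
  const settled = await Promise.allSettled(
    layers.map((layer) => runLayer(layer, scene, stages, onProgress)),
  );
  return layers.map((layer, i) => {
    const result = settled[i];
    return result.status === "fulfilled"
      ? { layer, ok: true as const, audio: result.value }
      : { layer, ok: false as const, error: String(result.reason) };
  });
}
```

Using Promise.allSettled rather than Promise.all means a failure in one layer (say, the music request) does not discard the other three, which fits the "completely independent pipelines" description.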



4 participants · 4 audience