

Challenge
Build something with turbopuffer and ElevenLabs APIs
Prizes
$15,292 total
1st Place
$9,182
$8,192 in turbopuffer credits + LEGO kit
3 months ElevenLabs Scale ($990)
2nd Place
$4,756
$4,096 in turbopuffer credits
2 months ElevenLabs Scale ($660)
3rd Place
$1,354
$1,024 in turbopuffer credits
1 month ElevenLabs Scale ($330)
Build something creative using turbopuffer's vector search and ElevenLabs APIs, then submit a high-quality viral-style video demonstrating what you've built.
turbopuffer is the search engine to connect large amounts of unstructured data to AI. Built from first principles on object storage with an intelligent cache layer, turbopuffer is just as fast as in-memory search engines but orders of magnitude cheaper to run — indexing over 3 trillion documents for leading AI companies.
ElevenLabs offers state-of-the-art audio AI. The Sound Effects API generates any sound effect from a text prompt. The Music API creates original music tracks from descriptions. Combine turbopuffer's semantic search with ElevenLabs' audio generation to build something unique.
We're particularly excited about music and sound effects use-cases. Show us creative combinations of vector search and audio AI — recommendations, RAG for audio, or anything that benefits from connecting unstructured data to ElevenLabs' generation capabilities.
Sign up on the turbopuffer event page to get $128 in credits for your first month while you prototype. Already a turbopuffer customer? Email hacks@turbopuffer.com to request your hackathon credits.
When posting your submission on social media, tag @turbopuffer and @elevenlabsio and use the hashtag #ElevenHacks.
Attendee offers
$128 in turbopuffer credits
Sign up on the event page for $128 in credits for your first month
1 month ElevenLabs Creator
Free month of ElevenLabs Creator plan for all attendees
16 Apr, 12:36
This week I built SoundForge! SoundForge is a tool that helps game developers manage sounds for their characters and maps by generating unique voices, sound effects, and music with ElevenLabs. Many new game developers get overwhelmed by all the aspects of game development, and SoundForge is here to make that a little easier. SoundForge also uses turbopuffer's vector database to map the relationships between characters, maps, and even sounds so game developers can make sure the story and lore stay consistent.

15 Apr, 21:42
StreamReact is a fully autonomous AI audio-visual producer that solves the painful multi-tasking burnout for solo streamers by instantly triggering perfectly-timed overlay reactions. It achieves zero-latency performance by using Turbopuffer's Native ANN vector search to map live voice transcripts against historical context, while leveraging the ElevenLabs Sound Effects API to dynamically generate and learn custom audio assets on the fly.

16 Apr, 15:58
92% of brands have a look. Only 8% have a sound. audentity changes that — generating a complete sonic identity for any brand in under 60 seconds, just from a URL. It uses turbopuffer's semantic vector search to match a brand's personality against a curated library of 30 sound archetypes, then ElevenLabs to generate a custom logo sound and brand theme on the spot. No studio, no budget, no waiting.

16 Apr, 07:35
Jungle Safari 🌿 is an audio adventure for toddlers (ages 1–3). A kid hears a real animal sound, taps the right animal, and a warm AI mascot voice responds with a tailored fun fact, different every time, based on exactly what they picked. ElevenLabs does double duty: Sound Effects for every animal cry, Text-to-Speech ("Bella" voice) for the mascot. Each mascot reply is generated fresh by GPT-4o-mini reacting to the specific wrong guess vs the right answer, then spoken aloud. No pre-recorded lines. turbopuffer indexes every animal as a 1536-dim OpenAI embedding with metadata, making the library semantically searchable for themed expeditions (RAG for audio). Upstash Redis caches every mascot response as base64 audio so no voice reply is ever computed twice for any user worldwide. Live: thejunglesafari.netlify.app · github.com/anirxdh/JungleSafari

15 Apr, 16:05
Objects That Sing is an AI-powered music app that turns your surroundings into a personalized motivational song.
What it does (the full flow):
- You photograph anything around you: your desk, your kitchen, your gym, your bedroom. Any photo works.
- AI scans the photo and detects 3–5 objects using Groq's Llama 4 vision model. It picks the most interesting, character-rich objects (not background noise).
- Each object gets a unique personality, fetched from Turbopuffer, a vector database seeded with personality profiles for hundreds of objects. A coffee mug has different energy than a dumbbell.
- You set the vibe with four choices:
  - Your goal right now (free text: "I need to stop procrastinating", "I need to call my mom")
  - Music genre (Hip Hop, R&B, Pop, Rock, Gospel, Lo-fi, Bollywood, K-pop, Electronic, Jazz, Reggaeton, Country)
  - Intensity (Gentle Nudge / Medium Fire / Full Beast Mode)
  - Language (English, Hindi, Tamil, Telugu, Spanish, French, Portuguese, Arabic, Korean, Japanese)
- Groq writes the lyrics using Llama 3.3 70B. Each object gets a verse where it speaks directly to you in first person, mid-conversation, with attitude. The object has a memory of your habits, an opinion about you specifically, and ends every verse with one direct demand. The chorus is all objects together at maximum energy.
- ElevenLabs sings the entire song: a full 90-second sung song with real vocals generated via the /v1/music API. Not text-to-speech, but actual music with singing.
- You listen in the Player, with your photo as cover art, scrolling lyrics synced to playback, genre badge, object tags, share button, and a download option.
- Library saves all your songs: everything you generate is saved to your profile for replay anytime.
Tech stack (layer: what's used):
- Frontend: React + Vite, Tailwind CSS, Framer Motion
- Backend: Express (Node.js)
- AI Vision: Groq, Llama 4 Scout 17B
- Lyrics: Groq, Llama 3.3 70B Versatile
- Music: ElevenLabs /v1/music (sung vocals)
- Personalities: Turbopuffer vector database
- Database: Replit PostgreSQL (via Drizzle ORM)
- Sessions: Cookie-based (no login required)
- Storage: Persistent file storage for photos + MP3s
Design:
- Mobile-first: works perfectly on any phone browser
- Desktop: shows the app inside a realistic iPhone 15 Pro frame with an animated dark space background (aurora blobs, floating music particles, glassmorphism side panel)
- iOS light-mode inside the phone: white cards, iOS blue accents, real shadows, native feel
What makes it different: The objects aren't motivational speakers. They're characters with attitude. A cable machine that's furious you walked past it. A carrot that's personally offended you keep choosing pizza. A notebook that remembers you haven't opened it in three days. The song is written specifically for you, about your specific goal, using only the objects actually in your photo. Nothing generic.

13 Apr, 18:21
Every musician has a tune in mind. What if you could search and create music by feel, play a song to create music Shazam-style, fuse genres into entirely new sounds, and transform lyrics into fully composed songs? Meet GrooveForge — THE ULTIMATE AI TOOLKIT FOR ORIGINAL MUSIC CREATION. GrooveForge empowers musicians to create in seconds through four powerful modes: Vibe Graph enables users to search and generate music by mood, emotion, and audio characteristics; Sound Match allows you to play a song while GrooveForge extracts its sonic fingerprint to generate something completely original in the same feel — Shazam, but for creation; Text-to-Music lets you type anything, from a mood or genre mashup to an era or even "Metallica meets Taylor Swift," as GrooveForge searches millions of blueprints to find the closest musical DNA and forges an original track grounded in real structure; and Lyrics-to-Music transforms your words into fully composed songs by analyzing emotion, themes, and rhythm to create music where every section fits your lyrics — poetry to production. At its core, GrooveForge leverages millions of indexed songs and audio blueprints enriched with features that define a track’s DNA. Built using datasets including Last.fm, Free Music Archive, the Million Song Dataset, and MusicCaps, it retrieves and analyzes the closest matches to generate original compositions grounded in real musical structure, ensuring precision, originality, and creative control. Powered by ElevenLabs for music generation, turbopuffer for lightning-fast vector retrieval, and Google Gemini for multimodal intelligence. Search by Vibe. Generate by Blueprint. Forge Your Sound. 🚀

16 Apr, 15:04
Have a new product? Let Stan Bran, the product marketing music man, take care of business for you. Using ElevenLabs and turbopuffer, simply tell Stan about your product and he does the rest. Don't have enough budget? Stan will let you use his free library, and even searches his deep database for you.

15 Apr, 10:56
History is preserved in text—but its sound is lost. Chronoscapes reconstructs historical audio scenes by turning archival documents into music and soundscapes grounded in real evidence. Given an era, event, or location, it retrieves context-rich, sound-relevant passages—news reports, eyewitness accounts, and cultural fragments—using Turbopuffer for fast semantic search across large-scale archives. A sample of the 340GB American Stories collection, comprising millions of digitized historical documents, was used to ground the system in authentic historical material. Chronoscapes then analyzes this content to extract latent sonic signals such as mood, tempo, instrumentation, and environmental texture. These signals are structured and transformed into immersive audio using ElevenLabs, generating both period-appropriate music and environmental sound. For deeper immersion, Chrono Radio turns any historical theme into a continuous broadcast: a living station where AI-voiced DJs introduce each track with context drawn from the archive, weaving music and narration into an unbroken journey through time. A query like “letters home from soldiers in the 1940s and the radio swing music that kept them going” becomes an hour of era-faithful sound, paced and hosted as if heard on the original airwaves. Instead of imagining the past, Chronoscapes makes it audible—bridging archival data and generative audio to bring history back to life.

16 Apr, 15:58
SonicMemoir turns personal memories into cinematic audio scenes. You type a moment like “the night we danced in the rain in Paris, 2019” and it generates an emotional soundtrack plus timed sound effects that make the memory feel alive. The app builds a personal “Life Album” over time, carrying forward recurring sonic motifs across memories. We used ElevenLabs for music and sound generation, and Turbopuffer for memory archetype retrieval and private memory indexing to personalize future generations.

16 Apr, 15:34
SoundCharades is an AI-powered audio guessing game where every sound effect and original song is generated from scratch by ElevenLabs. Players listen to layered soundscapes and AI-composed riddle songs to guess movies, video games, countries, food, and more. The game's difficulty engine is built entirely on turbopuffer, using cosine similarity across 768-dimensional embeddings to dynamically select answer options based on semantic proximity. The closer the concepts are in vector space, the harder it is to tell them apart. Players can also generate custom quizzes on any topic in minutes, with new concepts instantly queryable in turbopuffer the moment they're created. No manual curation, no static databases, just vectors, sound, and AI.
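The difficulty idea above can be sketched in a few lines. This is a minimal illustration, not SoundCharades' actual code: the `pick_distractors` helper and the toy 3-dimensional vectors are hypothetical, and a plain Python loop stands in for the turbopuffer query over 768-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pick_distractors(answer, concepts, k=3, hard=True):
    """Return k wrong options for a question.

    hard=True picks the concepts closest to the answer (harder to tell apart);
    hard=False picks the farthest ones (easier).
    """
    target = concepts[answer]
    scored = [
        (name, cosine_similarity(target, vec))
        for name, vec in concepts.items()
        if name != answer
    ]
    scored.sort(key=lambda item: item[1], reverse=hard)
    return [name for name, _ in scored[:k]]


# Toy vectors (assumption): the first axis is "big cat", the last is "food".
concepts = {
    "lion": [0.9, 0.1, 0.0],
    "tiger": [0.85, 0.15, 0.0],
    "house cat": [0.7, 0.3, 0.1],
    "pizza": [0.0, 0.2, 0.9],
    "sushi": [0.1, 0.1, 0.95],
}

hard_options = pick_distractors("lion", concepts, k=2, hard=True)
easy_options = pick_distractors("lion", concepts, k=2, hard=False)
print(hard_options)  # ['tiger', 'house cat']
print(easy_options)  # ['pizza', 'sushi']
```

With the answer "lion", the hard round offers other cats while the easy round offers food, which is the "closer in vector space, harder to tell apart" behaviour the entry describes.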

16 Apr, 14:45
LinguaQuest: Learn Any Language by Solving a Mystery
You're not a student. You're an investigator dropped into an ancient city at dusk, and the only way to uncover the hidden secret of its bazaar is to speak the language of the locals. LinguaQuest is a mobile-first language RPG built around a single core insight: the best way to acquire vocabulary is to hear it first, with no reading and no grammar drills. Just sound and meaning, the way children actually learn.
How it uses ElevenLabs: Every vocabulary word is spoken by a native-quality voice generated on demand via the ElevenLabs TTS API. This isn't pre-recorded audio; it's dynamic voice generation that can scale to any language, any word set. The entire dual-channel mechanic (hear the word → match the image) only works because the audio quality is indistinguishable from a real speaker.
How it uses turbopuffer: The full vocabulary is indexed as vector embeddings in turbopuffer. When the game selects the next quest challenge, it runs semantic search against what the player has already learned, surfacing vocabulary that's adjacent in meaning and reinforcing retention through contextual proximity rather than random drill order. turbopuffer also powers the mystery hint system: story clues are retrieved by semantic similarity to the player's current vocabulary state.
How it uses Claude (via AWS Bedrock): The narrative layer adapts in real time. If a player hasn't learned a word yet, Claude generates a story hint that routes around it. As vocabulary grows, the mystery deepens.
Tech: Next.js · TypeScript · ElevenLabs TTS API · turbopuffer · Claude via AWS Bedrock · Tailwind CSS
Design philosophy: Every screen is a scene. Words are collectibles, not homework. The UI disappears into the story.
→ Repo: github.com/levanstein/LinguaQuest

16 Apr, 14:11
Dreamwave🌙 is an AI app that reveals hidden patterns in your dreams — and turns them into sound. People record dreams all the time, but it’s almost impossible to see the deeper patterns across weeks or months of messy, unstructured thoughts. Dreamwave solves that by using turbopuffer’s vector search to connect and cluster dream fragments — identifying recurring symbols, places, emotions, and subconscious “entities” that the user might never consciously notice. Then, using ElevenLabs’ audio generation APIs, Dreamwave translates those patterns into a personalized sonic experience - turning abstract subconscious behavior into something you can actually hear. The result isn’t just music - it’s a representation of your internal patterns over time. It transforms dream journaling from passive logging into something interactive, emotional, and deeply personal.

15 Apr, 15:13
Most code stays silent. But what if you could talk to it? That is the idea behind ElevenNotch: a lightweight app that lives right in your MacBook’s notch and gives you an AI coding partner in seconds. It understands your codebase, helps you brainstorm new features, think through refactors, and untangle confusing parts of a project. Working with code snippets is often tedious and error-prone. Copying and pasting rarely captures the full context, and important dependencies are easy to miss. ElevenNotch removes that friction. Just ask about any part of your code, and the assistant responds with clear, concise answers grounded in the bigger picture of your project. ElevenNotch is powered by two core technologies working in tandem. Your codebase is indexed using Mistral's Codestral Embed model — a state-of-the-art embedding model — and stored in TurboPuffer, a serverless vector database. When you ask a question, TurboPuffer performs a semantic vector search across your entire project in milliseconds, retrieving the most relevant code chunks regardless of how you phrased the question. This means asking "how does this part handle errors?" finds the right code even if the word "error" never appears in it. Retrieved chunks are then passed as context to a language model, which synthesizes a focused, grounded answer — one that reflects the actual shape of your project rather than generic assumptions about how the code might work. The answer is then delivered entirely through ElevenLabs' Text-to-Speech API, so the response comes back as natural, spoken audio rather than text on a screen. The result is a truly conversational experience: you speak a question, your codebase is searched, and a voice answers you back — without ever leaving your flow state or touching your keyboard. Instead of wrestling with isolated snippets, you get a fast, conversational way to explore and understand your codebase.

14 Apr, 19:28
Timeline Manipulator is a conspiracy corkboard that traces the ripple effects of world events. Pick an event from the carousel — or type your own — and watch as a sarcastic AI narrator walks you through the chain of consequences over custom noir jazz, from global catastrophe down to your morning coffee costing more. Turbopuffer powers two things: BM25 full-text search retrieves historical parallels from seeded world events for richer analysis, and semantic deduplication of custom events so repeat queries serve cached results instantly instead of burning API calls. ElevenLabs has five integration points: Text-to-Speech with a custom voice clone narrates every ripple. The Music API generates unique noir jazz per event, mood-matched to the crisis. The Sound Effects API creates domain-specific ambient audio per consequence card. All generated live for custom events, pre-cached for the 20 built-in scenarios. Also powered by Claude API for ripple-effect analysis with model fallback, React + Vite frontend, and Netlify Functions backend. 20 pre-cached events with full audio, live custom event analysis, branching timeline choices with glitch transitions, and a CASE CLOSED stamp when the investigation ends. Built by ReddX Industries from the Philippines.

13 Apr, 21:01
SoundDropLabs is an AI sound design tool that turns any text description into production-ready audio. Two modes: SFX Mode generates 4 unique sound effect variations in parallel. Scene Mode takes a scene description and outputs a full 4-layer DAW-style mix (ambience, foreground, background, music) in under 10 seconds. The core is a 4-stage RAG pipeline. Every generation embeds the user's query via HuggingFace, runs semantic search across 26,264 indexed Freesound samples in turbopuffer (~20ms, cosine distance), feeds the 8 closest acoustic neighbors into Gemini 2.0 Flash for prompt enrichment, then hits ElevenLabs SFX API x4 in parallel. The turbopuffer layer is what makes the generations actually sound grounded. Without it, ElevenLabs gets a vague prompt. With it, the model gets a vivid acoustic description built from real-world reference sounds. Scene Mode runs 4 completely independent pipelines simultaneously via Promise.allSettled. The music layer uses ElevenLabs Music API for a 30s instrumental. The other 3 layers use SFX API. Progress streams live to the browser via SSE so users watch each stage complete in real time. Full pipeline: ~5-6 seconds for SFX, ~8-10 seconds for a full scene. Live demo: https://v0-soundroplabs.vercel.app
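Scene Mode's fan-out can be approximated in Python with `asyncio.gather(..., return_exceptions=True)`, the closest standard-library analogue to JavaScript's `Promise.allSettled`. The sketch below is an illustration under assumptions, not the project's code: `generate_layer` is a hypothetical coroutine standing in for the real retrieve, enrich, and generate steps, and the forced failure of one layer is only there to show that the other layers still report their results.

```python
import asyncio


async def generate_layer(name, fail=False):
    """Hypothetical stand-in for one layer's retrieve -> enrich -> generate pipeline."""
    await asyncio.sleep(0.01)  # simulates network latency
    if fail:
        raise RuntimeError(f"{name} generation failed")
    return f"{name}.mp3"


async def build_scene():
    layers = ["ambience", "foreground", "background", "music"]
    # return_exceptions=True collects failures as values instead of raising
    # the first one, so every layer's outcome is reported.
    results = await asyncio.gather(
        *(generate_layer(name, fail=(name == "background")) for name in layers),
        return_exceptions=True,
    )
    return {
        name: ("failed" if isinstance(result, Exception) else result)
        for name, result in zip(layers, results)
    }


scene = asyncio.run(build_scene())
print(scene)
```

Because the four coroutines run concurrently, the wall-clock time is roughly that of the slowest layer rather than the sum of all four.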

12 Apr, 08:32
1. What did you build?
I built AetherVoice, a hardware-anchored AI agentic bridge that combines biometric security with personalized voice AI. It allows users to interact with their digital twin or autonomous agent using only their voice, secured by a hardware-level root of trust. By leveraging the AetherUX protocol, we’ve created a system where an AI agent only "wakes up" and speaks when the user's hardware identity (Secure Enclave) is biometrically verified.
2. What problem does it solve?
Current AI voice agents face two massive hurdles: Identity Fraud (Deepfakes) and Context Fragmentation.
- Security: Most voice bots have no way to verify the speaker is actually the authorized owner. AetherVoice solves this by requiring a hardware-anchored secp256r1 Passkey handshake before the agent vocalizes or accesses private data.
- Onboarding Friction: We eliminate the need for seed phrases and manual logins. Your voice and your biometric hardware are your identity.
- Memory Persistence: Traditional agents forget context. We use a high-performance vector memory to ensure agents have long-term, secure recall of past interactions.
3. How does it use ElevenLabs and turbopuffer?
- ElevenLabs: Powers the agent's vocal identity. We use ElevenLabs' high-fidelity voice design to create a persistent, recognizable "Vocal Fingerprint" for the agent. This voice is biometrically locked: it will only speak authorized responses once the AetherUX handshake is successful, preventing unauthorized audio generation.
- turbopuffer: Serves as the agent's Sovereign Memory. We use turbopuffer’s serverless vector search to store and retrieve millions of historical conversation fragments and user preferences. Because turbopuffer is 10-100x more cost-effective than traditional vector DBs, we can provide each user with a dedicated, massive "Memory Namespace" that allows for near-instant, semantic recall (RAG) during live voice conversations, all while maintaining sub-10ms response times for a natural "Invisible" UI experience.

16 Apr, 15:55
Resonar is a social platform where humans share raw audio stories and AI connects strangers through what they feel. People record 30-second to 3-minute voice notes about their real lives — the mundane, the vulnerable, the funny, the heavy. AI does the rest: ElevenLabs Speech-to-Text transcribes each story, Gemini extracts the emotional core and writes custom Music + SFX prompts tailored to that exact story, ElevenLabs Music API and Sound Effects API generate an original atmospheric layer that sits underneath the voice, and ElevenLabs Text-to-Speech narrates the Daily Resonance — a podcast episode assembled from the day's collective mood. Built solo in 72 hours using Next.js, turbopuffer (aws-ap-south-1), all 5 ElevenLabs APIs (STT, TTS, Music, SFX, Speech-to-Text batch), Gemini Flash 2.0, and Cloudflare R2.

16 Apr, 15:53
Creating anime music videos (AMVs) traditionally requires hours of manual editing: watching footage, finding the right scene for each lyric, and cutting on beat. This project automates the process: ElevenLabs generates the song with word-level timestamps from a text prompt, while turbopuffer's hybrid search (vector ANN + BM25) instantly finds the best matching anime clip for every beat from pre-indexed scenes. The result is a fully beat-synced, automatically generated AMV. Demo video: https://youtu.be/k597okobXsc Slides: https://tantk.github.io/amv-search/

16 Apr, 15:09
DocSurfer is an assistant built to turn documents into interactive, visual experiences. You can upload your own files and query them in natural language using turbopuffer-powered vector search, ensuring answers are fast, context-aware, and grounded directly in your content with cited passages for transparency. What sets DocSurfer apart is that it goes beyond traditional chat. From any response, you can generate an “Experience”, transforming the information into a comic-style learning experience with visuals, narration, sound effects, and background music, all powered by ElevenLabs.

16 Apr, 05:01
Project: AURA Protocol (MoodSync Jukebox)
I built AURA Protocol, a real-time, room-based social music platform that uses biometric facial detection to autonomously curate a shared soundtrack. Using a "Room Code" system (similar to Kahoot), users join a synchronized session where their webcams act as sensors. The system continuously analyzes facial telemetry to detect the collective mood of the room and instantly plays music that matches that specific energy.
What problem does it solve? Shared listening experiences often suffer from the "paradox of choice" or a disconnect between the music and the actual vibe of the people in the room. Manually skipping tracks or searching for songs kills the social momentum. AURA solves this by removing the UI friction entirely: the music evolves naturally based on subconscious emotional feedback (facial expressions), ensuring the "vibe" is always in sync without anyone needing to touch a button.
How does it use ElevenLabs? ElevenLabs serves as the "AI DJ" and the narrative soul of the application. Once a mood is detected and a song is selected, we use the ElevenLabs API to generate a high-fidelity, context-aware voiceover. This AI DJ doesn't just announce the song; it comments on the room’s detected state (e.g., "I see the energy is picking up! Let's keep this momentum going with..."). This creates a highly immersive and "living" atmosphere that feels responsive and human.
How does it use turbopuffer? turbopuffer is the high-performance semantic engine that makes the "Sync" possible. We store a vast library of song metadata and emotional embeddings within turbopuffer. When facial detection outputs a mood vector, we perform an ultra-low latency vector search in turbopuffer to find the most semantically relevant track. Its incredible speed allows the Jukebox to transition and react to mood changes in milliseconds, providing a seamless experience that would be impossible with traditional keyword-based databases.

12 Apr, 03:57
Turbopodcast is a podcast sound design tool 🎙
Upload your recording → auto-transcribe → AI detects sound moments → semantic search finds the right effect → generate what's missing on the spot.
The library grows with every generation.

16 Apr, 14:01
SoundPost turns your journal entry into a personalized AI-generated soundtrack: one card a day, so you never lose how a day felt. Most days dissolve by bedtime. SoundPost fixes that: type a few lines about your day, and ElevenLabs' music generation API produces a unique track that captures the mood. turbopuffer vector search on your journal embeddings then surfaces the days that "felt just like this", so patterns in your life become visible over time.
Tech stack:
- ElevenLabs: music generation API to synthesize the personalized daily soundtrack from extracted mood tags and key phrases
- turbopuffer: vector database for semantic similarity search across journal history

16 Apr, 12:17
Underscore turns a filmmaker's creative corpus into a grounded film score: no generic stock music, no guesswork. Upload your scripts, director's notes, subtitles, and moodboards. Every document is parsed, chunked, and embedded using Google's Gemini gemini-embedding-001 (768-dimensional vectors). Both prose chunks and sonic signature cross-embeddings are indexed into Turbopuffer, a serverless vector database, across two per-project namespaces. When you describe a scene, Underscore runs 4 parallel retrieval queries against Turbopuffer: cosine vector search, BM25 full-text search, director-notes-filtered vector search, and a sonic namespace query. Results are fused using Reciprocal Rank Fusion (RRF) to surface the most relevant evidence across all query types. Claude (Anthropic) reads the top retrieved chunks and synthesizes a cue brief (mood, tempo, instrumentation, key themes) grounded entirely in your own materials. It also outputs 3 music prompts (fast burst, cinematic, voice-weighted) and 2 SFX descriptions tuned to complement the scene. ElevenLabs Music (composeDetailed) generates all 3 score variants and a separate 120-second title track cue in parallel. ElevenLabs Sound Effects (textToSoundEffects) converts Claude's SFX descriptions into physical, environment-matched audio clips. All generated audio is stored in Vercel Blob (private access) and served via a proxy route. Auto scene extraction uses Claude to identify 3 dramatic moments from the indexed corpus, pre-filling the score generation workflow so filmmakers can start generating immediately after upload. The result: music and sound that actually know your film. PS: The docs/ folder contains a ready-made corpus for the short film The River; use it to test the full pipeline immediately.
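For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the standard formula, where each document scores the sum of 1 / (k + rank) over every ranking it appears in. The chunk IDs and the four toy rankings are made up for illustration, and k = 60 is the conventional default rather than a value stated in the entry.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) per list it appears in (rank starts at 1);
    documents are returned by descending total score, ties broken by ID.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the four query types described above.
vector_hits = ["c1", "c2", "c3"]
bm25_hits = ["c2", "c4", "c1"]
director_notes_hits = ["c2", "c1"]
sonic_hits = ["c5", "c2"]

fused = reciprocal_rank_fusion(
    [vector_hits, bm25_hits, director_notes_hits, sonic_hits]
)
print(fused)  # ['c2', 'c1', 'c5', 'c4', 'c3']
```

RRF uses only ranks, not raw scores, so BM25 scores and cosine distances never need to be normalized against each other. A chunk that appears in several lists (here `c2`) rises above one that tops a single list (`c5`).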

16 Apr, 11:43
Network Briefing is a feature inside Enlighten Copilot, an AI-powered tool for searching and briefing across your professional network, built on Turbopuffer and ElevenLabs. Turbopuffer powers semantic search across embedded chunks of your network's connection data, returning the right people for any query: hiring, travel, and meeting prep. ElevenLabs then generates a podcast-style audio briefing with narration, music, and sound effects composed live in the browser, where the music prompt is dynamically derived from the search results (industry, seniority, result density) so every search sounds different. What surprised us most: it's genuinely useful in a B2B context. Podcast-style briefing felt great for connections research, but we immediately got requests for detailed single-profile lookups and for people to link to their calendars to get professional pre-meeting audio on their way to their meetings.

16 Apr, 08:05
Séance reconstructs the ambient soundscape of any place in history. Type a location and year, and it generates what that moment actually sounded like. History has books for words and paintings for visuals. Nobody captured sound. Séance fills that gap. It uses ElevenLabs Sound Effects API to generate three layered audio tracks (ambient bed, human activity, environmental texture) in parallel. Gemini extracts historical sensory evidence to ground the prompts in reality. Artifacts are stored in Turbopuffer so repeat queries load instantly from the archive, with audio overflow on Cloudflare R2.

16 Apr, 07:29
WikiSounds: Paste any text. Get its unique song.
What does a job rejection letter sound like? A Reddit argument? Your company's about page? WikiSounds finds out. We indexed 1,100+ music genre descriptions from Wikipedia's entire documented history of human musical expression into turbopuffer. Paste anything, and we semantically match it against that curated vocabulary (post-punk, Tuvan throat singing, vaporwave, zydeco, everything humans have ever named and described), then feed the matched genres into ElevenLabs to generate a custom 30-second track.
The key insight: without the retrieval layer, you're just free-prompting a music generator with raw text. With turbopuffer in the middle, you're grounding generation in real musicological knowledge. Your Monday standup notes don't just become "office music," they become a specific blend of jumpstyle and juke, because that's what the semantic space says your words actually sound like.
How it works:
- Paste anything: a poem, a Slack thread, a recipe, your inner monologue
- Your text is embedded and matched against Wikipedia's music genre taxonomy via turbopuffer
- Matched genres plus a summary of your text become a rich music generation prompt
- ElevenLabs generates a unique 30-second track
- Share it with a permanent link
Stack: React + Cloudflare Workers + turbopuffer + OpenAI embeddings + ElevenLabs Music API

15 Apr, 16:37
Echoverse is a web-based interactive AI audio narrative engine. Users simply input a story premise to receive an immersive audio experience including narration, sound effects, and background music, with real-time choices driving the plot. The project uses ElevenLabs as the "voice of the world"—the TTS API generates narration, the Sound Effects API generates scene sound effects, and the Music API generates adaptive background music. These three layers of audio are mixed and played in real-time via the Web Audio API. Turbopuffer is used as the "memory of the world"—storing world elements, player decisions, and profiles in vector form for RAG retrieval to drive narrative generation. It also performs semantic vectorization caching of generated sound effects and background music, allowing them to be reused directly when the similarity between a new request and an existing asset exceeds a threshold, without repeatedly calling the generation API. This semantic caching mechanism is the project's core innovation: it creates a cost flywheel between ElevenLabs and turbopuffer—the more stories, the richer the cache, the fewer API calls, the faster the response, and the lower the cost. At the end of a single story, the cache hit rate can climb from approximately 10% initially to 40-50%. All user data and API keys are stored locally in the browser (localStorage + IndexedDB), with zero server-side persistence, prioritizing privacy.
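The semantic caching mechanism described above can be sketched as follows. This is an illustration under assumptions, not Echoverse's implementation: the `SemanticCache` class, the 0.95 threshold, and the 2-dimensional toy embeddings are hypothetical, and an in-memory linear scan stands in for the turbopuffer similarity query.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Reuse a stored asset when a new request is similar enough to an old one."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, asset) pairs
        self.hits = 0
        self.misses = 0

    def get_or_generate(self, embedding, generate):
        # Find the most similar cached request (linear scan for illustration).
        best_asset, best_score = None, -1.0
        for cached_embedding, asset in self.entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_asset, best_score = asset, score
        if best_score >= self.threshold:
            self.hits += 1
            return best_asset
        # Cache miss: pay for one generation call and remember the result.
        self.misses += 1
        asset = generate()
        self.entries.append((embedding, asset))
        return asset


generation_calls = []


def fake_generate(label):
    """Stand-in for a paid audio generation request."""
    generation_calls.append(label)
    return f"{label}.mp3"


cache = SemanticCache(threshold=0.95)
first = cache.get_or_generate([1.0, 0.0], lambda: fake_generate("rain"))
second = cache.get_or_generate([0.99, 0.05], lambda: fake_generate("drizzle"))
third = cache.get_or_generate([0.0, 1.0], lambda: fake_generate("thunder"))
print(first, second, third)  # rain.mp3 rain.mp3 thunder.mp3
print(cache.hits, cache.misses)  # 1 2
```

The second request is close enough to the first that the cached asset is reused and no generation call is made. As more assets accumulate, a larger share of requests fall above the threshold, which is the flywheel effect the entry describes.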

14 Apr, 19:36
Anymusic is a shared “mood radio”: people submit a short feeling in text, the app generates a ~22s audio clip and adds it to one queue everyone hears. ElevenLabs turns that composed music prompt into sound (and optional speech-to-text on /api/transcribe). Turbopuffer stores vectors for emotional archetypes (to pick prompts) and for each clip’s feeling (to order the queue so similar moods sit near each other).
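One simple way to order a queue so that similar moods sit next to each other is a greedy nearest-neighbour chain: start with one clip, then keep appending whichever remaining clip is most similar to the last one added. The entry does not say how Anymusic orders its queue, so the `order_queue` helper, the greedy strategy, and the toy 2-dimensional vectors below are all assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def order_queue(clips):
    """Greedy ordering: each clip is followed by its most similar remaining clip.

    clips maps a clip name to its feeling vector; the first entry starts the queue.
    """
    remaining = dict(clips)
    first = next(iter(remaining))
    queue = [first]
    current = remaining.pop(first)
    while remaining:
        nearest = max(
            remaining,
            key=lambda name: cosine_similarity(current, remaining[name]),
        )
        queue.append(nearest)
        current = remaining.pop(nearest)
    return queue


# Toy feeling vectors (assumption): axis 0 is "calm", axis 1 is "anger".
clips = {
    "calm": [1.0, 0.0],
    "furious": [0.0, 1.0],
    "peaceful": [0.9, 0.1],
    "angry": [0.1, 0.9],
}

queue = order_queue(clips)
print(queue)  # ['calm', 'peaceful', 'angry', 'furious']
```

The greedy chain is not globally optimal, but it is cheap and avoids abrupt jumps between consecutive clips in most cases.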
16 Apr, 18:05
If you are a solo streamer right now, your ability to compete with huge esports studio productions is effectively broken. Live streaming is an exhausting, multi-tasking nightmare. Solo creators are burning out trying to be the on-screen talent and the behind-the-scenes production crew. And because they can't afford massive production budgets, they get buried by the algorithm. StreamReact is the solution. By fully bridging unstructured voice data with real-time generative audio, it gives every solo creator their own autonomous AI production team. Most of the automated streaming tools out there suffer from lag that feels "sus". It is incredibly embarrassing to manage a complex production board while engaging with a live audience. StreamReact fixes this.

16 Apr, 14:46
SonicReal is an AI engine that turns your photos or text into immersive, 3D soundscapes. Imagine looking at a picture of a rainy forest and instantly hearing the rain hitting leaves, distant birds, and soft, moody music. Using Gemini to "see" the scene and turbopuffer to search millions of real-world sound patterns, the app directs ElevenLabs to build a layered, cinematic audio experience. It doesn't just play a sound; it orchestrates a reality. Use Case: It’s perfect for game developers and content creators who need instant, high-quality audio that perfectly matches their visuals without hours of manual editing.

16 Apr, 12:20
AI that listens to your video or script and adds the perfect sound effects automatically
16 Apr, 05:19
We built notbumblebee: a lo-fi track generator that automatically finds the perfect movie dialogue and composes a complete track around it. Type a vibe like "forest coffee" or "midnight drive," and get a full lo-fi beat with a movie quote woven into the arrangement. Lo-fi producers spend hours scrubbing through movie clips hunting for that one perfect soundbite to sample. notbumblebee does the digging by searching 10,000 movie clips from the VoxMovies dataset instantly and generating an original track that fits the quote. turbopuffer powers the retrieval. We use `multi_query` to run BM25 keyword matching and vector ANN over OpenAI text-embedding-3-large embeddings (3072 dimensions) in a single call, then fuse the rankings with Reciprocal Rank Fusion. This finds the right clip from 10,000 candidates in milliseconds — matching both the literal words and the semantic mood of what the user typed. ElevenLabs powers both the music and the data pipeline. Scribe v2 transcribes all 10K movie clips with word-level timestamps, giving us precise dialogue trimming and placement. For music generation, we call `/v1/music/plan` to generate a structured composition plan, then Claude edits it — adding a specific key, BPM, chord progression, and instruments for each section. The edited plan goes to `/v1/music/compose` to produce the full track. Each generation has a real musical arc: atmospheric intro with the dialogue, groove, peak, and outro.

16 Apr, 00:33
AI-powered audio engine using turbopuffer vector search to semantically map natural language to rich audio landscapes. Inputs are embedded into a high-dimensional mood space, matched via cosine similarity, then synthesized through ElevenLabs' generative audio pipeline into immersive, context-aware soundscapes in real-time.
