Ammon

1,250 points · 8 submissions

Submissions

with Cursor

+150

Voisurf reimagines the web as a fully voice-controlled experience. Instead of clicking, typing, tab switching, and fighting tiny UI elements, you simply talk. “Open Hacker News.” “Search for apartments in Brooklyn.” “Summarize this page.” “Play lo-fi beats.” The browser becomes conversational. Built with Cursor and ElevenLabs, Voisurf combines real-time voice interaction with browser automation to create a hands-free web designed for speed, accessibility, and flow. The goal wasn’t just to bolt voice commands onto Chrome — it was to make browsing feel natural, fluid, and futuristic. Cursor accelerated rapid prototyping and iteration, while ElevenLabs powered the ultra-responsive conversational voice layer that makes the experience feel alive. The result feels less like using a browser and more like talking to the internet itself.

Repo Demo

Submitted 14 May 2026

Hack #7: v0

with v0

+150

CribSwipe takes Craigslist, the most aggressively ugly apartment listing site on the internet, and reimagines it as a Tinder-style swipe experience for NYC apartment hunting. Craigslist hasn't meaningfully changed its UI since the 90s. Blue links, dense text, no images on the index, a layout that punishes you for wanting somewhere to live. CribSwipe keeps the exact same content (NYC apartment listings) and rebuilds the entire interaction model from scratch.

The redesign:
- The text-wall index becomes a full-bleed swipe deck: photos, price, neighborhood, key amenities at a glance
- Filtering happens through voice conversation instead of dropdowns and checkboxes
- The whole UI was built in v0

The audio layer: Meet Vinny, a fast-talking virtual realtor powered by ElevenLabs Conversational AI. Tell Vinny what you want ("two bedroom under 4k in Brooklyn, no walkups, dogs okay") and he filters the deck for you. He has opinions. He will roast your budget. He sounds like he's been showing apartments in Bay Ridge since 1994. The conversation drives the filters. The filters drive the deck. You swipe right on the place, not the realtor.

Stack:
- v0 for the full UI rebuild
- ElevenLabs Conversational AI for Vinny using multiple real-time tools, context updates, and prompt overrides
- Static Craigslist listings (live scraping would violate their TOS, but a real version would pull live data)

Try it: cribswipe.com
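To make "the conversation drives the filters, the filters drive the deck" concrete, here is a minimal sketch of how a structured filter (the kind of thing a voice tool call might hand back) could narrow a static list of listings. The `Listing` and `Filters` shapes and the `filter_deck` helper are hypothetical names invented for illustration; they are not taken from the CribSwipe repo.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Listing:
    # Hypothetical listing shape; the real data model is not described in the write-up.
    title: str
    borough: str
    bedrooms: int
    rent: int
    walkup: bool
    dogs_ok: bool


@dataclass
class Filters:
    # Each field is optional so the agent can refine one constraint at a time.
    borough: Optional[str] = None
    bedrooms: Optional[int] = None
    max_rent: Optional[int] = None
    allow_walkups: bool = True
    needs_dogs_ok: bool = False


def matches(listing: Listing, f: Filters) -> bool:
    """Return True if the listing satisfies every constraint that is set."""
    if f.borough is not None and listing.borough != f.borough:
        return False
    if f.bedrooms is not None and listing.bedrooms != f.bedrooms:
        return False
    if f.max_rent is not None and listing.rent > f.max_rent:
        return False
    if not f.allow_walkups and listing.walkup:
        return False
    if f.needs_dogs_ok and not listing.dogs_ok:
        return False
    return True


def filter_deck(listings: List[Listing], f: Filters) -> List[Listing]:
    """Rebuild the swipe deck from the current filters, preserving order."""
    return [listing for listing in listings if matches(listing, f)]
```

Under these assumptions, a request like "two bedroom under 4k in Brooklyn, no walkups, dogs okay" would map to `Filters(borough="Brooklyn", bedrooms=2, max_rent=4000, allow_walkups=False, needs_dogs_ok=True)`, and the deck is simply recomputed whenever the filters change.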

Repo Demo

Hack #6: Zed

with Zed

+150

Mad Lib Music turns pure chaos into fully produced songs. You answer a few ridiculous questions out loud — "dancing raccoon," "spaceship," "pickle" — and thirty seconds later you have a sea shanty about your roommate's ex, or a Broadway number about taxes, or a trap song about pickles. I built it during ElevenHacks to answer a simple question: what happens when you take the randomness of Mad Libs and force it through a state-of-the-art music generation pipeline? The result is a complete, genre-specific track generated end to end from whatever absurdity you fed it, with the classic Mad Libs mechanic intact except the output is a real song you actually want to send to the group chat. Three ElevenLabs features do the heavy lifting. A voice agent collects the inputs conversationally, so it feels like a game show (users can use a form if preferred). Those answers get reshaped into a structured lyrical prompt that emphasizes the spirit and mechanics of the Mad Libs game, then the music API generates the track and returns timestamped lyrics, which drive a lightweight web frontend for perfectly synced playback and shareable outputs. The whole pipeline — conversational input, structured prompt, genre-conditioned generation, synchronized output — was built in Zed. Most AI music tools start with intention. Mad Lib Music starts with chaos, and that constraint is exactly what makes every output surprising, personal, and coherent enough to be genuinely funny.
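As a rough illustration of how timestamped lyrics can drive synced playback, the sketch below looks up the lyric line that should be highlighted at a given playback time. It assumes the lyrics arrive as a list of `(start_seconds, text)` pairs sorted by start time; that shape and the `active_line` helper are assumptions made for this example, not the actual format returned by the music API.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical shape: (start time in seconds, lyric text), sorted by start time.
TimedLine = Tuple[float, str]


def active_line(lyrics: List[TimedLine], playback_time: float) -> Optional[str]:
    """Return the lyric line active at playback_time, or None before the first line."""
    starts = [start for start, _ in lyrics]
    # bisect_right finds the first line that starts strictly after playback_time;
    # the line just before that index is the one currently being sung.
    index = bisect_right(starts, playback_time) - 1
    if index < 0:
        return None
    return lyrics[index][1]
```

A frontend could call a function like this on every time update from the audio player and highlight whichever line it returns.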

Repo Demo

Hack #5: Kiro

with AWS Kiro

+150

Your AI agent just got a phone number, and it can call anyone on Earth, in any voice, the moment you ask. OpenCawl.ai is a telephony layer for OpenClaw that lets you trigger calls three ways: tell your agent in chat ("call the dentist and reschedule for Thursday"), fire one off from the OpenCawl UI, or let an automated workflow dial out on its own. Inbound works too, so your agent answers when someone calls its number. Transcripts, outcomes, and live status stream back to both your agent and the UI so you always know what happened on the other end of the line. Built Cloudflare-native with ElevenLabs and Kiro doing the heavy lifting. ElevenLabs powers every conversation through initiation webhooks for dynamic context injection, mid-call tool use for real actions, and the full voice library so your agent can sound like anyone. The entire codebase came together across multiple passes with Kiro.Dev, which let me spec, scaffold, and cleanly refactor the telephony orchestration instead of duct-taping my way to a submission. Two sponsors, two perfect fits.

Repo Demo

Hack #4: turbopuffer

with turbopuffer

+150

WikiSounds: Paste any text. Get its unique song.

What does a job rejection letter sound like? A Reddit argument? Your company's about page? WikiSounds finds out. We indexed 1,100+ music genre descriptions from Wikipedia's entire documented history of human musical expression into turbopuffer. Paste anything, and we semantically match it against that curated vocabulary... post-punk, Tuvan throat singing, vaporwave, zydeco, everything humans have ever named and described... then feed the matched genres into ElevenLabs to generate a custom 30-second track.

The key insight: without the retrieval layer, you're just free-prompting a music generator with raw text. With turbopuffer in the middle, you're grounding generation in real musicological knowledge. Your Monday standup notes don't just become "office music," they become a specific blend of jumpstyle and juke, because that's what the semantic space says your words actually sound like.

How it works:
- Paste anything: a poem, a Slack thread, a recipe, your inner monologue
- Your text is embedded and matched against Wikipedia's music genre taxonomy via turbopuffer
- Matched genres plus a summary of your text become a rich music generation prompt
- ElevenLabs generates a unique 30-second track
- Share it with a permanent link

Stack: React + Cloudflare Workers + turbopuffer + OpenAI embeddings + ElevenLabs Music API
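The retrieval step described above can be sketched in a few lines. This example deliberately swaps in tiny hand-written vectors and a brute-force cosine search in place of OpenAI embeddings and turbopuffer, so it shows the idea without calling either service; `top_genres` and `build_prompt` are hypothetical helper names, and the prompt wording is made up.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_genres(query_vec: Sequence[float],
               genre_vecs: Dict[str, Sequence[float]],
               k: int = 2) -> List[str]:
    """Rank genres by similarity to the query embedding and keep the best k."""
    ranked = sorted(genre_vecs.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


def build_prompt(genres: List[str], summary: str, seconds: int = 30) -> str:
    """Combine matched genres and a text summary into one generation prompt."""
    blend = " and ".join(genres)
    return f"A {seconds}-second track blending {blend}, inspired by: {summary}"
```

In the real pipeline the brute-force `sorted` call would be replaced by a vector query against the indexed genre descriptions; the point of the sketch is only that the generator sees named genres plus a summary instead of the raw pasted text.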

Repo Demo

Hack #3: Replit

with Replit

+200

Get pep talks from Future You, the AI that calls you as if it is the person you're trying to become. Future You uses ElevenLabs Voice Cloning to capture your voice, then calls you before your workout as your future self, the version of you who already crushed your goals. It pulls real biometric data from your Fitbit (sleep score, HRV, resting heart rate, active zone minutes) and weaves it into a personalized coaching call. After your workout, Future You calls back with a debrief based on what actually happened. We used ElevenLabs Voice Cloning for the Future You persona, Voice Design for multiple guest coach personalities (drill instructor, zen master, 80s aerobics icon, and a no-nonsense commander), and Conversational AI with ElevenAgents for interactive call sessions. We use overrides for prompts and first messages as well as conversation initiation webhooks for inbound calls. We use Twilio for phone conversations and ElevenLabs API for web conversations. Built entirely with Replit Agent 4 with shockingly few prompts. The moment that makes it real: hearing your own voice tell you "You slept 5 hours and 40 minutes. Your HRV is down. I know you want to skip. I made that call once. Don't, or you never become me." That's not a notification. That's motivation.

Repo Demo

Hack #2: Cloudflare

with Cloudflare

+200

CallWiz is a voice scheduling assistant that finds meeting times across different booking systems like Calendly, Cal.com, Google Calendar, and Workmate, then books the multi-way meeting through a single conversation. It solves the painful back-and-forth of group scheduling by letting someone talk naturally, ask for different dates, and keep refining options without restarting the workflow. On the Cloudflare side, I use Workers and Durable Objects as the agent orchestration layer to maintain session state, coordinate tool calls across the conversation, paginate and refresh availability over time, and keep the scheduling agent reliable across both web and phone voice channels. For ElevenLabs, I have a custom agent with initiation webhooks, a cloned voice (my own), client and server tools, and agent settings overrides. Tons of fun building this!
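The core of multi-way scheduling is intersecting each participant's free time. The sketch below assumes every booking system's availability has already been normalized into sorted, non-overlapping `(start, end)` intervals in minutes since midnight; that normalization, and the `intersect` and `common_slots` names, are assumptions for illustration rather than CallWiz's actual implementation.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) in minutes since midnight, end exclusive


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted lists of non-overlapping intervals with a two-pointer sweep."""
    i = j = 0
    out: List[Interval] = []
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # Advance whichever interval finishes first; it cannot overlap anything later.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def common_slots(calendars: List[List[Interval]], min_minutes: int) -> List[Interval]:
    """Return windows where every calendar is free for at least min_minutes."""
    if not calendars:
        return []
    common = sorted(calendars[0])
    for calendar in calendars[1:]:
        common = intersect(common, sorted(calendar))
    return [(start, end) for start, end in common if end - start >= min_minutes]
```

When the caller asks for a different date or a longer meeting, the same function can be re-run against refreshed availability without restarting the conversation.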

Repo Demo

Hack #1: Firecrawl

with Firecrawl

+100

Quantext is like TV with a commentator who actually did the homework.

The problem: YouTube is full of confident claims and hot takes. Watching alone means you either swallow it or pause and Google forever. I wanted watch-together energy: someone in the room who can fact-check, rebut, or roast at the right beat, then get out of the way.

What I built: Paste any YouTube URL. The app pulls the transcript (or transcribes), uses an LLM to find high-signal moments (stats, bold claims, missing context), then Firecrawl runs parallel search + scrape across those moments so every interjection ships with real sources, not vibes. At playback, an ElevenLabs Conversational AI agent fires on timestamp (pause → speak → auto-resume) so it feels like the video was edited to include commentary. Ask a question mid-watch and the agent can search again live via a custom Firecrawl client-side tool, so ad-hoc rabbit holes don’t break the illusion.

Clever Firecrawl use: Research isn’t one big dump. It’s per-claim, pre-orchestrated so interjections are instant at runtime, plus on-demand when the user interrupts. That’s the hack: pre-compute the expensive part, keep Firecrawl in the loop when the human goes off-script.

Clever ElevenLabs use: This isn’t TTS reading a script. It’s a full agent session with tooling (pause/resume the player, live search) and persona-conditioned behavior. I cloned my voice and fed roughly a decade of my stand-up comedy writing into one agent’s world model so “me” sounds like my timing and voice, not a generic narrator. On top of that: 23 distinct commentator personas (e.g. Grandma Gloria, Rage Rick, Sir David Attenburro, Ackshually Alex, Conspiracy Carl) and two modes, The Full Picture (sourced fact-checking) and The Fool Picture (MST3K-style roasting), so the same pipeline feels like a roster of shows, not one bot.

Engineering that matters:
- Caching (transcripts, prep artifacts, session payloads) so repeat watches and shares don’t re-pay the full prep tax.
- Share links so you can send a URL that opens the same video, persona, and mode: demoable, viral-friendly, and actually usable after the hackathon.

Why it’s a flex: It chains Firecrawl’s research graph with ElevenLabs’ realtime voice + tools in a loop built around real video time, not a chat window dressed up as a product. The “wait, it stopped the video and cited that?” moment is the whole thesis.
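The "pre-compute the expensive part, fire on timestamp" idea can be sketched as a small queue of prepared interjections that is polled with the current playback time. Everything here is a simplified assumption: `Interjection`, `InterjectionQueue`, and `play_due` are hypothetical names, the player is modeled as a plain log of actions, and backward seeking is not handled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(order=True)
class Interjection:
    at: float                                   # video timestamp in seconds
    text: str = field(compare=False)            # commentary prepared ahead of time
    sources: List[str] = field(compare=False, default_factory=list)


class InterjectionQueue:
    """Holds pre-researched interjections and releases each one exactly once."""

    def __init__(self, items: List[Interjection]):
        self._items = sorted(items)  # ordered by timestamp only
        self._next = 0

    def due(self, playback_time: float) -> List[Interjection]:
        """Return interjections whose timestamp has been reached and not yet fired."""
        fired = []
        while (self._next < len(self._items)
               and self._items[self._next].at <= playback_time):
            fired.append(self._items[self._next])
            self._next += 1
        return fired


def play_due(queue: InterjectionQueue, playback_time: float, player_log: List[str]) -> None:
    """Model the pause -> speak -> resume loop by appending actions to a log."""
    for item in queue.due(playback_time):
        player_log.append("pause")
        player_log.append(f"speak: {item.text}")
        player_log.append("resume")
```

Because the research is attached to each interjection before playback starts, the runtime loop only has to compare timestamps, which is what keeps the commentary feeling instant.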

Repo