2,100 points · 8 submissions
with Cursor
Atlas is a voice-first AI assistant for your desktop. You operate your entire computer just by talking — open apps, take notes, read your screen, see through your camera, generate images and music, find files, control your system. Sixteen tools, all voice-callable, running natively on macOS, Windows, and Linux.

The problem: every computing interface today assumes you can use a keyboard. That's a friction tax for everyone and a wall for the 250M+ people globally with motor impairments. Existing voice control tools are pre-LLM and feel decades old.

Atlas is built on four ElevenLabs primitives — Conversational Agent for orchestration, Scribe v2 for sub-150ms speech-to-text, Flash v2.5 for streaming TTS, and Instant Voice Clone so Atlas speaks in a voice you actually chose (a partner, a mentor, anyone meaningful). Half-duplex mic gating, native interruption, sub-second total latency. The voice loop just works.

Cursor shipped most of the code with me — a cross-platform Tauri 2 desktop app (Rust core + React frontend) with a Cloudflare Workers backend, built and polished in seven days solo.

Available now: landing-roan-psi-56.vercel.app
Talk to your computer. It does everything.
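The half-duplex mic gating described above can be sketched as a tiny state machine: mute the mic while TTS audio is playing, reopen it when playback drains or the user barges in. A minimal sketch in Python; the class and method names are illustrative, not Atlas's actual code.

```python
# Half-duplex mic gate sketch: mic and speaker never open at once,
# but a user barge-in (interrupt) can cut playback and reopen the mic.
# Names are illustrative, not taken from Atlas.

class MicGate:
    def __init__(self):
        self.mic_open = True    # listening for the user
        self.speaking = False   # TTS audio currently playing

    def start_speaking(self):
        """Agent begins streaming TTS: close the mic to avoid self-echo."""
        self.speaking = True
        self.mic_open = False

    def finish_speaking(self):
        """Playback drained: reopen the mic for the next utterance."""
        self.speaking = False
        self.mic_open = True

    def interrupt(self):
        """User barge-in: stop playback immediately and reopen the mic."""
        if self.speaking:
            self.speaking = False
            self.mic_open = True

gate = MicGate()
gate.start_speaking()   # agent talks: mic is gated shut
gate.interrupt()        # barge-in: playback stops, mic reopens
```

In a real loop, `start_speaking` would fire when the first TTS chunk arrives and `interrupt` would be driven by voice-activity detection on the gated mic stream.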
Submitted 14 May 2026
with v0
NASA · Transmit is an editorial cinematic redesign of nasa.gov. Every word stays verbatim from the live site — every headline, every Artemis II crew name, every mission. The rendering is reimagined as a magazine you fall into. Same content. Different ceremony.

Three ElevenLabs integrations:
↳ Pre-generated voice tour (TTS API)
↳ Composed ambient soundtrack (Music API, 2:30 instrumental)
↳ Conversational AI agent — a Mission Control voice that answers live questions about the ISS

Built and scaffolded with v0. Vite, vanilla JS, no framework, no design library. 16 kB before first paint.

Signature moments: a hard contrast flip from cosmic dark to warm cream paper with an embossed wax seal. A horizontal scroll-jack from low Earth orbit to Voyager 1 with a tension drone rising in pitch. A cursor flashlight section revealing hidden constellations.

The constraint of "you cannot change a single word" was the most generative brief I've worked from.

Live: nasa-eight-smoky.vercel.app
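The pre-generated voice tour is the simplest of the three integrations to sketch: each verbatim passage is rendered once at build time against the public ElevenLabs v1 text-to-speech endpoint and shipped as a static audio file. Only the endpoint shape below follows the documented API; the voice ID, model choice, and tour-line placeholders are assumptions, not the project's actual build script.

```python
# Build-time TTS pre-generation sketch. VOICE_ID and TOUR_LINES are
# placeholders; only the request shape follows the public ElevenLabs
# v1 text-to-speech API (POST /v1/text-to-speech/{voice_id}).
import json

API_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

def tts_request(text: str, voice_id: str = "VOICE_ID") -> dict:
    """Build one pre-generation request for a verbatim passage."""
    return {
        "url": API_URL.format(voice_id=voice_id),
        "headers": {
            "xi-api-key": "ELEVENLABS_API_KEY",   # from env in practice
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "text": text,                          # verbatim nasa.gov copy
            "model_id": "eleven_multilingual_v2",  # assumed model choice
        }),
    }

# One request per tour stop; responses saved as static files at build time.
TOUR_LINES = ["<verbatim passage 1>", "<verbatim passage 2>"]
requests_to_send = [tts_request(line) for line in TOUR_LINES]
```

Pre-generating at build time keeps the page static: no API key ships to the browser, and the 16 kB first-paint budget stays intact.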
with Zed
OUTRUN is a synthwave endless runner with one twist that changes everything: the AI radio DJ chants whatever you type into the menu. Type ADITYA, see those six letters spawn on the runway, hear them chanted back at you in the same voice that greets you by name and reads your final stats aloud. Three sectors, escalating speed, a cyber shop between levels, gold shield power-ups that wrap Rocky the pentapod in a shimmering aura, and an AI-narrated recap that reads your run back to you when it's over. Every voice line, every sound effect, every music track — generated by ElevenLabs. Built entirely in Zed.
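The typed-name chant is just dynamic text handed to TTS: spell the name out letter by letter, then shout it whole. A toy sketch; the chant format below is invented for illustration, not OUTRUN's actual line template.

```python
# Toy chant-line builder for an OUTRUN-style radio DJ: spell the
# typed name letter by letter, then shout it whole. The format is
# invented; the real game feeds a line like this to ElevenLabs TTS.

def chant_line(name: str) -> str:
    """Turn a typed menu name into a DJ chant string."""
    letters = " ".join(name.upper())       # "ADITYA" -> "A D I T Y A"
    return f"{letters}... {name.upper()}!"

# chant_line("aditya") -> "A D I T Y A... ADITYA!"
```

The same string can drive both the runway letter spawns and the TTS request, so the visuals and the chant never drift apart.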
with AWS Kiro
Bound turns any photographed book page into an immersive audio scene — narrator voice cast to the genre, character voices for dialogue, ambient sound built around the scene, music following the emotional arc. Not an audiobook. Scene rendering. Built spec-first in Kiro (6 feature specs, 4 steering docs, 2 agent hooks, ElevenLabs Kiro Power). Uses all four ElevenLabs audio APIs — Voice Design, TTS v3, Sound Generation, Music. If the phone is what takes us away from books, the phone can be what brings us back. Try it: https://bound-590954766263.us-central1.run.app
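"Narrator voice cast to the genre" is a mapping step that can be sketched independently of the Voice Design call itself: the genre detected from the photographed page selects a voice-design prompt. The genre table and prompt wording below are invented for illustration; the real app would feed the chosen prompt to the ElevenLabs Voice Design API.

```python
# Toy genre -> voice-design-prompt casting table for a Bound-style app.
# Genres and prompt strings are illustrative, not Bound's actual specs.

CASTING = {
    "gothic": "low, deliberate narrator with a dry, candlelit gravity",
    "adventure": "bright, quick narrator with rising energy",
    "romance": "warm, intimate narrator, close-mic and unhurried",
}

def cast_narrator(genre: str) -> str:
    """Pick a voice-design prompt for the detected genre, with a fallback."""
    return CASTING.get(genre, "neutral, clear narrator")
```

Keeping the casting table as data rather than branching logic is what a spec-first workflow like Kiro's makes natural: the table lives in a feature spec, the code just looks it up.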
with turbopuffer
Resonar is a social platform where humans share raw audio stories and AI connects strangers through what they feel. People record 30-second to 3-minute voice notes about their real lives — the mundane, the vulnerable, the funny, the heavy.

AI does the rest: ElevenLabs Speech-to-Text transcribes each story, Gemini extracts the emotional core and writes custom Music + SFX prompts tailored to that exact story, ElevenLabs Music API and Sound Effects API generate an original atmospheric layer that sits underneath the voice, and ElevenLabs Text-to-Speech narrates the Daily Resonance — a podcast episode assembled from the day's collective mood.

Built solo in 72 hours using Next.js, turbopuffer (aws-ap-south-1), all 5 ElevenLabs APIs (STT, TTS, Music, SFX, Speech-to-Text batch), Gemini Flash 2.0, and Cloudflare R2.
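The per-story pipeline (transcribe, extract the emotional core, generate atmosphere) hands the vector store one record per story: an embedding plus the transcript and mood attributes that drive the Music and SFX prompts. A sketch of the record shape only; the field names, the toy 3-dimensional vector, and the commented-out namespace are assumptions, not Resonar's schema.

```python
# Sketch of the per-story record a Resonar-style pipeline would index.
# Field names and the 3-dim toy vector are illustrative; a real
# deployment would use a proper embedding model.

def story_record(story_id: str, transcript: str, mood: str,
                 embedding: list[float]) -> dict:
    """Shape one voice note for vector indexing (transcript from
    ElevenLabs Speech-to-Text, mood label from Gemini)."""
    return {
        "id": story_id,
        "vector": embedding,        # emotional-core embedding
        "attributes": {
            "transcript": transcript,
            "mood": mood,           # drives the Music + SFX prompts
        },
    }

rec = story_record("story-001", "I fixed my dad's radio today.",
                   "bittersweet nostalgia", [0.1, 0.9, 0.3])
# With the turbopuffer client this would be roughly (API shape may differ):
#   import turbopuffer as tpuf
#   tpuf.Namespace("resonar-stories").upsert([rec])
```

Querying that namespace by mood vector is what lets "connect strangers through what they feel" be a nearest-neighbour search rather than a follow graph.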
with Replit
Whisper is an app where you point your camera at literally anything — your coffee mug, a plastic bottle, a tree, your own phone — and it speaks to you with a unique AI-designed voice, real personality, and genuine knowledge pulled from the web in real time.

Your coffee mug speaks in a warm, scratchy, world-weary voice: "Another Monday. You always grip me tighter on Mondays. The coffee inside me traveled 6,200 miles from Sidamo, Ethiopia. A woman picked those beans by hand last October. Slow down. Taste it. For her." Your houseplant is passive-aggressive about being in the corner. Your running shoes guilt-trip you about the 11 days since your last run. A 90-year-old tree tells you about the proposal it witnessed in 1987.

But here's where it gets powerful: point at a plastic bottle and the OCEAN speaks — in a vast, ancient, tired voice — telling you what plastic is doing to its body, with real statistics, then suggesting the refill station 0.3 miles from you. Point at fast fashion and the cotton field tells you about the sea that was drained to grow it. Environmental awareness through empathy, not guilt.
with Cloudflare
Echoverse is a real-time multiplayer platform where people collaboratively build immersive audio worlds using only their voices. Players speak commands to generate sounds, layer environments, and then step INTO those worlds as voice-transformed characters — creating live audio dramas, interactive soundscapes, and collaborative stories that are recorded, rendered, and shareable.

Say "rain." You hear rain. Say "I'm an old man by the fire." Your voice transforms. Now you're a character inside a world you built together with strangers, in real time, from nothing. When you're done, the scene is a produced audio piece you can share, publish, or sell.

Every creative platform on the internet is visual: TikTok, Instagram, YouTube, Figma, Canva. Audio creation has no equivalent — no real-time, multiplayer, accessible platform where ordinary people (not musicians or audio engineers) can create rich audio content together. Echoverse fills that gap by making voice the only tool you need.
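The two command types above map onto two different ElevenLabs capabilities: "rain" is a sound-generation prompt, while "I'm an old man by the fire" means transforming the speaker's own voice (speech-to-speech). A toy router sketch; the first-person heuristic and the capability labels are assumptions, not Echoverse's parser.

```python
# Toy router for spoken Echoverse-style commands: first-person
# role-play goes to voice transformation, everything else becomes a
# sound-generation prompt. Heuristic and labels are illustrative.

ROLEPLAY_MARKERS = ("i'm ", "i am ", "im ")

def route_command(command: str) -> dict:
    """Decide which audio capability a spoken command should hit."""
    text = command.strip().lower()
    if text.startswith(ROLEPLAY_MARKERS):
        # "I'm an old man by the fire" -> transform the speaker's voice
        return {"capability": "speech_to_speech", "persona": command}
    # "rain" -> generate an ambient layer for the shared world
    return {"capability": "sound_effects", "prompt": command}
```

In a real room, an LLM intent classifier would replace the prefix check, but the fan-out to the two audio capabilities stays the same.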
with Firecrawl
I built a voice-first time machine. It's called Kleos. Say "The fall of the Berlin Wall" — in under 4 minutes, you get a fully-produced cinematic audio documentary. Original character voices, orchestral score, sound effects, ambient soundscapes, AI illustrations. All from one spoken prompt. In any language.

How it works: A Concierge Agent takes your prompt. Firecrawl researches the event. Gemini writes the script. Then ElevenLabs does everything else:
↳ Era-accurate character voices (Voice Design)
↳ Emotional multi-character dialogue (Text to Dialogue)
↳ Synced sound effects + ambient beds (Sound Effects API)
↳ Original film score with auto-ducking (Music Composition)
↳ Multilingual support — experience history in your language
↳ Instant Voice Clone for "You Were There" mode — where you become a character

The part that gave me chills: Tap any character's portrait and have a real voice conversation with them. Ask Gandhi what he was thinking. Ask Armstrong what the silence felt like. They answer in character, in your language, citing real sources.
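The "how it works" hand-off above is a linear pipeline: prompt, research, script, audio render. A skeleton sketch with stubbed stages; every function body here is a placeholder, not Kleos's code — Firecrawl, Gemini, and the ElevenLabs APIs would back these stages in the real app.

```python
# Stubbed pipeline skeleton for a Kleos-style documentary build.
# Each stage is a placeholder standing in for a real service call.

def research(prompt: str) -> dict:
    """Firecrawl would go here: crawl sources about the event."""
    return {"event": prompt, "sources": []}

def write_script(notes: dict) -> dict:
    """Gemini would go here: turn research notes into scenes."""
    return {"event": notes["event"], "scenes": []}

def render_audio(script: dict) -> dict:
    """ElevenLabs would go here: Voice Design, Text to Dialogue,
    Sound Effects, and Music Composition per scene."""
    return {"event": script["event"],
            "tracks": ["dialogue", "sfx", "score"]}

def build_documentary(prompt: str) -> dict:
    """One spoken prompt in, one produced documentary out."""
    return render_audio(write_script(research(prompt)))

doc = build_documentary("The fall of the Berlin Wall")
```

The under-4-minute budget comes from the stages being sequential but each fanning out internally, so the per-scene audio renders run concurrently inside `render_audio`.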
Submitted 7 May 2026
Submitted 30 Apr 2026
Submitted 23 Apr 2026
Submitted 16 Apr 2026
Submitted 6 Apr 2026
Submitted 30 Mar 2026
Submitted 23 Mar 2026