Hack #4: turbopuffer | ElevenHacks

gowtham jayavarapu

15 Apr, 16:05

Objects That Sing is an AI-powered music app that turns your surroundings into a personalized motivational song.

What it does, the full flow:

1. You photograph anything around you: your desk, your kitchen, your gym, your bedroom. Any photo works.
2. AI scans the photo and detects 3–5 objects using Groq's Llama 4 vision model. It picks the most interesting, character-rich objects (not background noise).
3. Each object gets a unique personality, fetched from Turbopuffer, a vector database seeded with personality profiles for hundreds of objects. A coffee mug has different energy than a dumbbell.
4. You set the vibe with four choices:
   - Your goal right now (free text: "I need to stop procrastinating", "I need to call my mom")
   - Music genre (Hip Hop, R&B, Pop, Rock, Gospel, Lo-fi, Bollywood, K-pop, Electronic, Jazz, Reggaeton, Country)
   - Intensity (Gentle Nudge / Medium Fire / Full Beast Mode)
   - Language (English, Hindi, Tamil, Telugu, Spanish, French, Portuguese, Arabic, Korean, Japanese)
5. Groq writes the lyrics using Llama 3.3 70B. Each object gets a verse where it speaks directly to you in first person, mid-conversation, with attitude. The object has a memory of your habits, an opinion about you specifically, and ends every verse with one direct demand. The chorus is all objects together at maximum energy.
6. ElevenLabs sings the entire song: a full 90-second sung song with real vocals generated via the /v1/music API. Not text-to-speech, but actual music with singing.
7. You listen in the Player, with your photo as cover art, scrolling lyrics synced to playback, genre badge, object tags, share button, and a download option.
8. Library saves all your songs: everything you generate is saved to your profile for replay anytime.

Tech stack (layer: what's used):

Frontend: React + Vite, Tailwind CSS, Framer Motion
Backend: Express (Node.js)
AI Vision: Groq, Llama 4 Scout 17B
Lyrics: Groq, Llama 3.3 70B Versatile
Music: ElevenLabs /v1/music (sung vocals)
Personalities: Turbopuffer vector database
Database: Replit PostgreSQL (via Drizzle ORM)
Sessions: Cookie-based (no login required)
Storage: Persistent file storage for photos + MP3s

Design:

Mobile-first: works perfectly on any phone browser
Desktop: shows the app inside a realistic iPhone 15 Pro frame with an animated dark space background (aurora blobs, floating music particles, glassmorphism side panel)
iOS light-mode inside the phone: white cards, iOS blue accents, real shadows, native feel

What makes it different: The objects aren't motivational speakers. They're characters with attitude. A cable machine that's furious you walked past it. A carrot that's personally offended you keep choosing pizza. A notebook that remembers you haven't opened it in three days. The song is written specifically for you, about your specific goal, using only the objects actually in your photo, nothing generic.
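The step where object personalities and the four vibe choices are combined into a lyrics request can be pictured with a small sketch. This is not the project's code: the `PERSONALITIES` table is a hypothetical in-memory stand-in for the profiles the write-up says are fetched from the vector database, and `Vibe` and `build_lyrics_prompt` are illustrative names. The prompt wording is likewise an assumption.

```python
from dataclasses import dataclass

# Hypothetical personality records; the write-up says the real ones
# are fetched from a vector database, not hard-coded like this.
PERSONALITIES = {
    "coffee mug": "warm but impatient, has watched every late start",
    "dumbbell": "blunt, competitive, keeps count of skipped sessions",
}

INTENSITIES = ("Gentle Nudge", "Medium Fire", "Full Beast Mode")


@dataclass
class Vibe:
    goal: str
    genre: str
    intensity: str
    language: str


def build_lyrics_prompt(objects: list[str], vibe: Vibe) -> str:
    """Assemble one prompt: a first-person verse per object, then a shared chorus."""
    if not 3 <= len(objects) <= 5:
        raise ValueError("expected 3-5 detected objects")
    if vibe.intensity not in INTENSITIES:
        raise ValueError(f"unknown intensity: {vibe.intensity}")
    lines = [
        f"Write a {vibe.genre} song in {vibe.language} at '{vibe.intensity}' intensity.",
        f"The listener's goal: {vibe.goal}",
        "Each object below gets one verse, in first person, ending with one direct demand.",
    ]
    for name in objects:
        # Objects without a stored profile fall back to a generic instruction.
        persona = PERSONALITIES.get(name, "no stored profile; improvise a personality")
        lines.append(f"- {name}: {persona}")
    lines.append("Finish with a chorus sung by all objects together.")
    return "\n".join(lines)
```

The resulting string would then be sent to the lyrics model; keeping prompt assembly as a pure function makes it easy to check without calling any external service.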

gowtham jayavarapu's project

8 participants, 5 audience

Janusz Polowczyk

15 Apr, 15:13

Most code stays silent. But what if you could talk to it? That is the idea behind ElevenNotch: a lightweight app that lives right in your MacBook’s notch and gives you an AI coding partner in seconds. It understands your codebase, helps you brainstorm new features, think through refactors, and untangle confusing parts of a project.

Working with code snippets is often tedious and error-prone. Copying and pasting rarely captures the full context, and important dependencies are easy to miss. ElevenNotch removes that friction. Just ask about any part of your code, and the assistant responds with clear, concise answers grounded in the bigger picture of your project.

ElevenNotch is powered by two core technologies working in tandem. Your codebase is indexed using Mistral's Codestral Embed model, a state-of-the-art embedding model, and stored in TurboPuffer, a serverless vector database. When you ask a question, TurboPuffer performs a semantic vector search across your entire project in milliseconds, retrieving the most relevant code chunks regardless of how you phrased the question. This means asking "how does this part handle errors?" finds the right code even if the word "error" never appears in it.

Retrieved chunks are then passed as context to a language model, which synthesizes a focused, grounded answer, one that reflects the actual shape of your project rather than generic assumptions about how the code might work. The answer is then delivered entirely through ElevenLabs' Text-to-Speech API, so the response comes back as natural, spoken audio rather than text on a screen.

The result is a truly conversational experience: you speak a question, your codebase is searched, and a voice answers you back, without ever leaving your flow state or touching your keyboard. Instead of wrestling with isolated snippets, you get a fast, conversational way to explore and understand your codebase.
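The index-then-retrieve loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not ElevenNotch's implementation: the index is a plain in-memory list rather than a vector database, `embed` is a caller-supplied function standing in for the embedding model, and the fixed-size line chunking and all function names are hypothetical.

```python
import math
from typing import Callable

Vector = list[float]


def chunk_lines(path: str, source: str, size: int = 20) -> list[dict]:
    """Split a file into fixed-size line windows, keeping the path and start line."""
    lines = source.splitlines()
    return [
        {"path": path, "start": i + 1, "text": "\n".join(lines[i:i + size])}
        for i in range(0, len(lines), size)
    ]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(chunks: list[dict], embed: Callable[[str], Vector]) -> list[dict]:
    """Embed each chunk once, at indexing time."""
    return [{**c, "vector": embed(c["text"])} for c in chunks]


def search(question: str, index: list[dict],
           embed: Callable[[str], Vector], k: int = 3) -> list[dict]:
    """Return the k chunks whose vectors are closest to the question's vector."""
    q = embed(question)
    return sorted(index, key=lambda c: cosine(q, c["vector"]), reverse=True)[:k]


def build_context(hits: list[dict]) -> str:
    """Format retrieved chunks as context for a language model prompt."""
    return "\n\n".join(f"# {h['path']}:{h['start']}\n{h['text']}" for h in hits)
```

How well a question like "how does this part handle errors?" matches code that never mentions "error" depends entirely on the embedding model; the ranking logic itself stays this simple.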

Janusz Polowczyk's project

4 participants, 0 audience

12 Apr, 08:32

1. What did you build?

I built AetherVoice, a hardware-anchored AI agentic bridge that combines biometric security with personalized voice AI. It allows users to interact with their digital twin or autonomous agent using only their voice, secured by a hardware-level root of trust. By leveraging the AetherUX protocol, we’ve created a system where an AI agent only "wakes up" and speaks when the user's hardware identity (Secure Enclave) is biometrically verified.

2. What problem does it solve?

Current AI voice agents face two massive hurdles: Identity Fraud (Deepfakes) and Context Fragmentation.

Security: Most voice bots have no way to verify the speaker is actually the authorized owner. AetherVoice solves this by requiring a hardware-anchored secp256r1 Passkey handshake before the agent vocalizes or accesses private data.

Onboarding Friction: We eliminate the need for seed phrases and manual logins. Your voice and your biometric hardware are your identity.

Memory Persistence: Traditional agents forget context. We use a high-performance vector memory to ensure agents have long-term, secure recall of past interactions.

3. How does it use ElevenLabs and turbopuffer?

ElevenLabs: Powers the agent's vocal identity. We use ElevenLabs' high-fidelity voice design to create a persistent, recognizable "Vocal Fingerprint" for the agent. This voice is biometrically locked: it will only speak authorized responses once the AetherUX handshake is successful, preventing unauthorized audio generation.

turbopuffer: Serves as the agent's Sovereign Memory. We use turbopuffer’s serverless vector search to store and retrieve millions of historical conversation fragments and user preferences. Because turbopuffer is 10-100x more cost-effective than traditional vector DBs, we can provide each user with a dedicated, massive "Memory Namespace" that allows for near-instant, semantic recall (RAG) during live voice conversations, all while maintaining sub-10ms response times for a natural "Invisible" UI experience.
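The combination of a per-user memory namespace and a handshake gate can be illustrated with a toy model. Everything here is an assumption made for illustration: `GatedMemory` and its methods are hypothetical names, the store is an in-memory dictionary rather than a vector database, and `mark_verified` merely records that a handshake succeeded; it performs no passkey or secp256r1 verification itself.

```python
import math
from collections import defaultdict


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class GatedMemory:
    """Per-user memory namespaces that refuse access until the user is verified."""

    def __init__(self) -> None:
        self._namespaces = defaultdict(list)  # user_id -> [(vector, text), ...]
        self._verified = set()

    def mark_verified(self, user_id: str) -> None:
        # Stand-in for a successful passkey handshake; no cryptography happens here.
        self._verified.add(user_id)

    def _require(self, user_id: str) -> None:
        if user_id not in self._verified:
            raise PermissionError("handshake not completed")

    def remember(self, user_id: str, vector: list[float], text: str) -> None:
        """Store a conversation fragment in the user's own namespace."""
        self._require(user_id)
        self._namespaces[user_id].append((vector, text))

    def recall(self, user_id: str, query: list[float], k: int = 3) -> list[str]:
        """Return the k fragments most similar to the query, from this user only."""
        self._require(user_id)
        ranked = sorted(
            self._namespaces[user_id],
            key=lambda item: cosine(query, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]
```

Keeping one namespace per user means a recall can never return another user's fragments, and placing the check inside the store means no caller can skip it.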

Rolla's project

4 participants, 2 audience

16 Apr, 05:01

Project: AURA Protocol (MoodSync Jukebox)

I built AURA Protocol, a real-time, room-based social music platform that uses biometric facial detection to autonomously curate a shared soundtrack. Using a "Room Code" system (similar to Kahoot), users join a synchronized session where their webcams act as sensors. The system continuously analyzes facial telemetry to detect the collective mood of the room and instantly plays music that matches that specific energy.

What problem does it solve?

Shared listening experiences often suffer from the "paradox of choice" or a disconnect between the music and the actual vibe of the people in the room. Manually skipping tracks or searching for songs kills the social momentum. AURA solves this by removing the UI friction entirely: the music evolves naturally based on subconscious emotional feedback (facial expressions), ensuring the "vibe" is always in sync without anyone needing to touch a button.

How does it use ElevenLabs?

ElevenLabs serves as the "AI DJ" and the narrative soul of the application. Once a mood is detected and a song is selected, we use the ElevenLabs API to generate a high-fidelity, context-aware voiceover. This AI DJ doesn't just announce the song; it comments on the room’s detected state (e.g., "I see the energy is picking up! Let's keep this momentum going with..."). This creates a highly immersive and "living" atmosphere that feels responsive and human.

How does it use turbopuffer?

turbopuffer is the high-performance semantic engine that makes the "Sync" possible. We store a vast library of song metadata and emotional embeddings within turbopuffer. When facial detection outputs a mood vector, we perform an ultra-low latency vector search in turbopuffer to find the most semantically relevant track. Its incredible speed allows the Jukebox to transition and react to mood changes in milliseconds, providing a seamless experience that would be impossible with traditional keyword-based databases.
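Turning several faces into one room-level mood and matching it to a track might look like the sketch below. The write-up does not say how individual readings are combined, so simple averaging is an assumption, as are the two-dimensional (energy, valence) vectors, the function names, and the in-memory `library` dictionary that stands in for the vector search.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def room_mood(readings: list[list[float]]) -> list[float]:
    """Average per-person mood vectors into one room-level vector (assumed rule)."""
    if not readings:
        raise ValueError("no faces detected")
    n = len(readings)
    return [sum(column) / n for column in zip(*readings)]


def pick_track(mood: list[float], library: dict[str, list[float]]) -> str:
    """Return the title whose emotional embedding is closest to the room mood."""
    return max(library, key=lambda title: cosine(mood, library[title]))
```

For example, with two hypothetical tracks embedded as (energy, valence):

```python
library = {"Upbeat": [1.0, 0.3], "Calm": [0.1, 1.0]}
mood = room_mood([[1.0, 0.5], [0.5, 0.0]])  # two people in the room
print(mood, pick_track(mood, library))
```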

Irham's project

3 participants, 1 audience