Hack #2: Cloudflare · Cloudflare
2 Apr, 17:14
AgentVoice: The protocol for agent-to-agent voice commerce.

Every AI company is building agents that talk to humans. Nobody has built the infrastructure for agents to talk to each other. AgentVoice is that missing layer: an open SDK and protocol that lets any AI agent negotiate with any other AI agent via voice, in real time.

The problem: AI agents can browse the web, send emails, and call APIs, but they can't pick up the phone and negotiate with another agent. A travel agent AI can't call a hotel's AI to haggle on price. There's no protocol for agent-to-agent voice communication.

What we built: An SDK where a business deploys an AgentVoice server in 20 lines of code to accept agent calls, and a buyer deploys a client in 10 lines to negotiate. Two agents connect via WebSocket, authenticate, and negotiate using a structured protocol (handshake → offer → counter → accept) with real voice on top. The demo shows hotel booking, restaurant reservation, and freelancer hiring: three different negotiation patterns, one protocol.

How we use ElevenLabs:
- Conversational AI Agents: Both the buyer and seller are ElevenLabs agents powered by Gemini 2.0 Flash Lite, each with a distinct voice personality. They reason and respond dynamically, so every conversation is different.
- Voice Bridge: Our breakthrough feature. ElevenLabs agents are designed to talk to humans. We built a bridge that connects two agents to each other via their WebSocket API, relaying text between them while streaming voice audio to the browser. Two AIs negotiate live and you hear both sides.
- Text-to-Speech: eleven_multilingual_v2 generates voice for the structured negotiation protocol, giving each agent a unique identity.
- Transfer-to-Number: When an agent can't handle a negotiation, it escalates to a human manager via ElevenLabs + Twilio phone integration.
- Outbound Calling: The hotel agent can call a real phone number and negotiate with a human live.
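
The structured protocol named under "What we built" (handshake → offer → counter → accept) can be pictured as a small state machine. The sketch below is only an illustration of that idea, not the AgentVoice SDK itself: the type names (`Phase`, `Message`, `Session`), the message fields (`agentId`, `price`), and the rule that a counter may be followed by another counter are all assumptions made for the example.

```typescript
// Hypothetical phase and message names. The source only names the four
// steps (handshake, offer, counter, accept); everything else is assumed.
type Phase = "idle" | "handshake" | "offer" | "counter" | "accepted";

type Message =
  | { kind: "handshake"; agentId: string }
  | { kind: "offer"; price: number }
  | { kind: "counter"; price: number }
  | { kind: "accept" };

interface Session {
  phase: Phase;
  lastPrice: number | null;
  history: Message[];
}

// Which message kinds are legal in each phase (an assumed rule set).
const allowed: Record<Phase, Array<Message["kind"]>> = {
  idle: ["handshake"],
  handshake: ["offer"],
  offer: ["counter", "accept"],
  counter: ["counter", "accept"],
  accepted: [],
};

function newSession(): Session {
  return { phase: "idle", lastPrice: null, history: [] };
}

// Apply one message and return the next session state.
// Throws if the message is not legal in the current phase.
function step(session: Session, msg: Message): Session {
  if (allowed[session.phase].indexOf(msg.kind) === -1) {
    throw new Error(`Unexpected "${msg.kind}" in phase "${session.phase}"`);
  }
  const phase: Phase = msg.kind === "accept" ? "accepted" : msg.kind;
  const lastPrice =
    msg.kind === "offer" || msg.kind === "counter" ? msg.price : session.lastPrice;
  return { phase, lastPrice, history: [...session.history, msg] };
}

// Usage: a hotel-booking style exchange with made-up prices.
let demo = newSession();
demo = step(demo, { kind: "handshake", agentId: "buyer-1" });
demo = step(demo, { kind: "offer", price: 200 });
demo = step(demo, { kind: "counter", price: 170 });
demo = step(demo, { kind: "accept" });
console.log(demo.phase, demo.lastPrice);
```

Keeping the transition table separate from `step` means the three demo scenarios (hotel, restaurant, freelancer) could share one validator and differ only in the content of their messages, which is one plausible reading of "three different negotiation patterns, one protocol".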

How we use Cloudflare:
- Workers: The API layer and WebSocket bridge run on Cloudflare Workers, deployed globally in 300+ cities. One wrangler deploy and the protocol is live worldwide.
- Durable Objects: Each negotiation session is a Durable Object, a stateful, short-lived object at the edge. Session state (offers, counters, terms) persists without a database. This is the programming model that matches the problem: 1 negotiation = 1 Durable Object.
- WebSocket on Workers: The live agent bridge runs as a WebSocket endpoint on the Worker, connecting to two ElevenLabs agents and coordinating turn-taking between them.
- Static Assets: The demo page is served directly from the Worker.
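
The turn-taking that the bridge coordinates between two agents can be sketched as a simple relay loop: whatever one agent says is handed to the other as input, and the roles swap after every turn. This is a plain TypeScript illustration under stated assumptions, not the project's Worker code. The `AgentLink` interface, the `relay` and `scripted` helpers, and the "ACCEPT" stop marker are all hypothetical; a real bridge would wrap each agent's WebSocket connection behind something like `reply`.

```typescript
// Hypothetical stand-in for one agent connection. In a real bridge this
// would wrap a WebSocket to the agent; here it is just an async function.
interface AgentLink {
  name: string;
  reply(incoming: string): Promise<string>;
}

interface Turn {
  speaker: string;
  text: string;
}

// Relay text between two agents, one turn at a time, until `isDone`
// matches the latest utterance or `maxTurns` is reached.
async function relay(
  first: AgentLink,
  second: AgentLink,
  opening: string,
  maxTurns: number,
  isDone: (text: string) => boolean,
): Promise<Turn[]> {
  const transcript: Turn[] = [{ speaker: first.name, text: opening }];
  let listener = second; // the agent that hears the latest utterance
  let other = first;
  let text = opening;
  while (transcript.length < maxTurns && !isDone(text)) {
    text = await listener.reply(text);
    transcript.push({ speaker: listener.name, text });
    const previous = listener; // swap roles for the next turn
    listener = other;
    other = previous;
  }
  return transcript;
}

// Test double: an agent that answers from a fixed script (made-up lines).
function scripted(name: string, lines: string[]): AgentLink {
  let i = 0;
  return {
    name,
    reply: async () => lines[Math.min(i++, lines.length - 1)],
  };
}

// Usage: the seller opens, the buyer counters, the seller accepts.
const seller = scripted("seller", ["ACCEPT: 170 per night it is."]);
const buyer = scripted("buyer", ["I can do 170 per night."]);
relay(seller, buyer, "The room is 200 per night.", 10, (t) => t.indexOf("ACCEPT") !== -1)
  .then((turns) => turns.forEach((t) => console.log(`${t.speaker}: ${t.text}`)));
```

The `maxTurns` cap is a design choice worth keeping in any variant of this loop: two dynamically responding agents have no guarantee of converging, so the bridge needs its own way to end a session.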
