
Parallel Search Powers Sesame's New iOS Voice Agent App

Sesame, the startup from the Oculus founders, has launched a public preview iOS app featuring low-latency voice agents driven by parallel search.

On May 28, 2026, Sesame released its public preview app on iOS. Founded by the original creators of Oculus, the company brings its low-latency conversational agents to mobile devices after a million-user research preview. The release introduces a technical architecture designed to solve the standard tension between retrieval depth and conversational speed.

Parallel Search Infrastructure

Most conversational AI systems follow a linear execution path. They receive a prompt, execute remote web searches, await the retrieval results, generate a text response, and synthesize the final audio. Sesame’s infrastructure avoids this bottleneck with a parallel search system that runs multiple web queries at once while the agent is already speaking.
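The overlap between searching and speaking can be sketched with Python's asyncio. This is a minimal illustration, not Sesame's implementation: `fake_search`, `speak`, and `answer` are hypothetical names, and the searches are simulated with short sleeps rather than real network calls.

```python
import asyncio


async def fake_search(query: str, delay: float) -> str:
    """Hypothetical stand-in for a remote web search (sleeps, no network)."""
    await asyncio.sleep(delay)
    return f"result for {query!r}"


async def speak(chunks: list[str], spoken: list[str], per_chunk: float = 0.01) -> None:
    """Hypothetical stand-in for speech synthesis: 'plays' one chunk at a time."""
    for chunk in chunks:
        await asyncio.sleep(per_chunk)
        spoken.append(chunk)


async def answer(queries: list[str]) -> tuple[list[str], list[str]]:
    spoken: list[str] = []
    # Start every search immediately; the tasks run concurrently.
    search_tasks = [asyncio.create_task(fake_search(q, 0.02)) for q in queries]
    # Begin speaking an opening phrase while the searches are still in flight.
    await speak(["Let me", "check that."], spoken)
    # Collect the results; gather() preserves the order of the input tasks.
    results = await asyncio.gather(*search_tasks)
    await speak([f"I found {len(results)} sources."], spoken)
    return spoken, list(results)


spoken, results = asyncio.run(answer(["opening hours", "ticket price", "location"]))
```

Because the opening phrase plays while the queries run, the listener waits roughly as long as the slowest search rather than the sum of all of them.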

This concurrency allows the agent to alter its responses mid-sentence. If incoming search results introduce new facts or contradict an initial assumption, the system adjusts its ongoing speech synthesis to pivot its train of thought in real time. For engineers who stream LLM responses, this execution model requires careful handling of dynamic token invalidation (discarding generated text that has not yet been spoken) and audio regeneration. The result is a natural conversational cadence that masks the underlying retrieval latency.
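One way to picture the mid-sentence pivot is a speech plan whose unspoken portion can be thrown away and replaced. The sketch below assumes a simple chunk queue; `SpeechPlan` and its methods are hypothetical names, and the article does not describe how Sesame actually tracks or regenerates audio.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SpeechPlan:
    """Hypothetical queue of text chunks waiting to be synthesized."""
    pending: deque = field(default_factory=deque)
    spoken: list = field(default_factory=list)

    def plan(self, chunks: list) -> None:
        """Queue chunks for later synthesis."""
        self.pending.extend(chunks)

    def speak_next(self) -> Optional[str]:
        """Move the next pending chunk to 'spoken'; None when nothing is left."""
        if not self.pending:
            return None
        chunk = self.pending.popleft()
        self.spoken.append(chunk)
        return chunk

    def revise(self, replacement: list) -> list:
        """Invalidate every unspoken chunk and queue the replacement instead.

        Chunks already spoken cannot be taken back, so only the pending
        part of the plan changes. Returns the chunks that were dropped.
        """
        dropped = list(self.pending)
        self.pending.clear()
        self.pending.extend(replacement)
        return dropped


plan = SpeechPlan()
plan.plan(["The museum", "opens at nine", "every day."])
plan.speak_next()  # "The museum" has now been spoken
# A (simulated) search result contradicts the rest of the planned sentence.
dropped = plan.revise(["actually opens at ten", "on weekends."])
while plan.speak_next() is not None:
    pass
```

Only the pending chunks are replaced, which is why the correction has to read naturally after whatever was already said aloud.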

Agent Memory and Privacy Controls

The free iOS application is currently available in 39 countries. Users interact with four distinct AI personas: Maya, Miles, Simone, and Charlie. Each agent maintains persistent memory across sessions to build continuous context. The app supplements the primary voice interface with text-based messaging and renders Visual Search Cards that display real-time image results alongside the audio stream.

To support privacy during sensitive queries, Sesame built a dedicated Incognito mode. When activated, the agent can still read prior session context to understand the current conversation, but the system prevents any new data from being written to Sesame’s servers or the agent’s long-term storage. If you work on agent memory, this dual-state architecture demonstrates a practical pattern for balancing personalized interactions with strict data minimization. The system also includes automated note-taking and reminder scheduling triggered directly by the conversational flow.
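The read-but-do-not-write behavior described above can be sketched as a memory store with a single flag. This is a simplified illustration under stated assumptions: `AgentMemory`, `recall`, and `remember` are hypothetical names, and an in-memory list stands in for whatever long-term storage Sesame uses.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical long-term memory with an incognito switch."""
    long_term: list = field(default_factory=list)
    incognito: bool = False

    def recall(self) -> list:
        """Reads are always allowed, so prior context stays available."""
        return list(self.long_term)

    def remember(self, fact: str) -> bool:
        """Persist a fact unless incognito is on; report whether it was stored."""
        if self.incognito:
            return False  # nothing is written while incognito is active
        self.long_term.append(fact)
        return True


memory = AgentMemory()
memory.remember("prefers metric units")

memory.incognito = True
context = memory.recall()                       # prior context is still readable
stored = memory.remember("asked about a medical symptom")
```

Keeping the check inside the write path means callers cannot bypass it by forgetting to test the flag themselves.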

Hardware Requirements and Platform Roadmap

The Sesame app requires iOS 18.0 or higher. Apple Vision Pro users can install the application on visionOS 2.0 or later. Backed by a $250 million Series B funding round led by Sequoia and a16z, the company has confirmed an Android version is in development.

The current mobile application functions as an intermediate step toward dedicated hardware. Sesame plans to release intelligent eyewear in 2027, moving the interface off smartphone screens entirely. Future software updates will also transition the underlying models from pure conversational retrieval into action-oriented systems capable of executing external digital tasks. Developers building real-time voice agents will face increasing competition from these hardware-integrated approaches.

If you are designing voice-first AI interfaces, Sesame’s mid-sentence pivot capability raises the baseline expectation for latency and natural flow. Evaluate your retrieval pipelines to determine whether running searches concurrently hides more of your system latency than optimizing sequential speech synthesis models would.
