How We Built an AI Sleep Coach: CBTI, Voice AI, and a Product Playbook for Better Rest

Podcast cover art reading 'Just Now Possible' with a teal neural network graphic and a yellow banner that says 'Building an AI Sleep Coach @ Rest' on a dark blue background, tech and sleep theme.

What if your morning started with a helpful check-in from a voice AI that actually improves your sleep—using the same core principles that typically cost thousands of dollars and come with year-and-a-half waitlists? That idea energizes me as a product leader, because it blends clinical-grade outcomes with consumer-grade accessibility. Recently, I dug into how the team at Rest built an AI sleep coach inspired by Cognitive Behavioral Therapy for Insomnia (CBTI), and why their method offers a repeatable blueprint for complex, personal AI products.

The origin story is a classic product discovery moment. Rest’s team noticed that a meaningful slice of users in their podcast app were using audio to fall asleep. Although it represented only about 10% of users, that group showed a high willingness to pay. That signal pushed them to explore a dedicated sleep solution, moving from a general audio app to a targeted sleep experience—and eventually toward an AI-powered coach as LLMs matured.

Through jobs-to-be-done research, they identified a clear, underserved segment: “DIY sleep hackers.” These are motivated users who want agency, structure, and results without navigating clinical systems. Choosing CBTI (a clinically proven approach with 80% efficacy) gave the product a strong evidence-based foundation while remaining accessible as a wellness tool. It’s the kind of strategic choice I look for: credible, measurable, and aligned with user motivation.

The product evolution moved in smart, incremental steps. Rest started with a basic text chatbot before graduating to a voice-first experience—using Vapi for voice and OpenAI for reasoning. Voice changed the relationship dynamic: it increased intimacy, lowered friction for daily check-ins, and made behavioral coaching feel human without pretending to be. The team built a memory system that tracks context (like traveling or having a dog) with time-based relevance, which keeps conversations fresh, respectful, and genuinely personalized.

Daily engagement is driven by dynamic agendas that adapt based on sleep data, the user’s stage in the program, and their recent compliance. I love this mechanic: it operationalizes behavior change by sequencing the right intervention at the right time. In parallel, they developed text via OpenAI Assistants while building voice with Vapi, which let them ship value while learning in two modes. They also moved from massive system prompts to RAG for general sleep knowledge, keeping personal user context in the prompt—reducing brittleness while improving scalability.

Because sleep sits close to healthcare, the team drew a firm line between wellness and medical positioning. They implemented clear guardrails: no diagnosis, no medication advice, and strong boundaries on scope. Weekly error analyses with domain experts (sleep therapists) tightened quality and tone, and they adopted LLM-powered evals to enforce safety boundaries. For observability and evaluations, they leveraged Langfuse, and they experimented with Hamming for voice testing to refine the experience end-to-end.

Under the hood, this is a great example of “one bite of the apple at a time” product building in AI. Start with a simple interface, anchor on an evidence-based method, layer personalization with memory, formalize program structure with dynamic agendas, and shift to RAG when general knowledge outgrows prompt engineering. As a product leader, I see strong echoes of agentic patterns here—goal-oriented orchestration, stateful memory, and adaptive planning—shipped in pragmatic increments rather than as a monolithic platform rewrite.

A few takeaways I’m applying with my teams: First, segment deeply and pick a high-intent niche (those “DIY sleep hackers” were the right beachhead). Second, let modality fit the job—voice is not a gimmick when it boosts compliance and empathy. Third, design safety and scope from day one if you’re anywhere near health. Finally, invest early in evals and observability so you can improve with confidence, not hope.

If you want to explore the full conversation and product decisions, you can listen here: Spotify | Apple Podcasts.

Resources & Links:

Rest – AI sleep coach app

Vapi – Voice agent platform Rest uses

Langfuse – Observability and evals platform

Hamming – Voice testing platform

AI Evals Maven Course by Hamel Husain and Shreya Shankar

Bottom line: Rest demonstrates how to take a clinically grounded method like CBTI, translate it into a daily voice-first experience, and ship it with rigor. If you’re building in AI, this is a model worth studying—practical, safe, and deeply user-centered.


Inspired by this post on Product Talk.


Book a consult png image

What approach underpins Rest’s AI sleep coach?

Rest’s AI sleep coach is grounded in CBTI, a clinically proven approach with about 80% efficacy. It’s positioned as a wellness tool to keep the guidance accessible while maintaining credibility.

How did the product evolve from text to voice?

They started with a basic text chatbot and then moved to a voice-first experience using Vapi for voice and OpenAI for reasoning. They added memory to track context and keep conversations personalized.

What is the role of memory in the AI coach?

They built a memory system that tracks context (like traveling or having a dog) with time-based relevance. This helps keep conversations fresh, respectful, and genuinely personalized.

What are dynamic agendas and why do they matter?

Dynamic agendas drive daily engagement and adapt based on sleep data, the user’s stage in the program, and recent compliance. This mechanism sequences the right intervention at the right time to support behavior change.

How does Rest handle safety and medical boundaries?

They drew a clear line between wellness and medical positioning, with guardrails on scope. Weekly error analyses with sleep therapists and LLM-powered evals help enforce safety and quality.

What observability tools are mentioned?

Langfuse is used for observability and evals, and Hamming is used for voice testing to refine the experience end-to-end.

Where can you listen to the full conversation?

Listeners can find the full conversation on Spotify and Apple Podcasts. The post invites readers to explore the full conversation there.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *

Signup for Weekly Digest Emails

Categories

Archieve