I build and ship AI products in an environment where the frontier changes weekly, so my planning system has to be adaptive, evidence-driven, and unapologetically outcome-focused. In this piece, I share the frameworks I use to set goals for generative AI, balance research with product execution, and scale responsibly — drawing sharp lessons from one of the most influential applied AI companies operating today.
Consider Runway, an applied AI research company shaping the next era of art, entertainment, and human creativity. Runway has raised $237 million and was named one of Time magazine’s “100 most influential companies” in 2023. It has gone viral repeatedly in recent years and is behind many of the most famous AI demos online.
The earliest stages of an AI company often begin with research breakthroughs, scrappy prototypes, and clever distribution. In practice, that means leveraging containerization (https://aws.amazon.com/what-is/containerization/) and Docker (https://www.docker.com/) to package models reproducibly, showcasing work where practitioners already gather — Hugging Face (https://huggingface.co/), Hugging Face Spaces (https://huggingface.co/spaces), and Hugging Face Model Hub (https://huggingface.co/docs/hub/models-the-hub) — and tapping infrastructure like Replicate (https://replicate.com/) to get demos into people’s hands. Early, magical use cases — like the Green screen tool by Runway (https://runwayml.com/green-screen/) — teach us which problems are both technically feasible and viscerally valuable.
I’ve learned to be cautious about the limitations of being “customer-driven” when building in AI. Traditional product discovery assumes needs are legible and solutions are relatively deterministic. In generative AI, user desire often follows model capability, not the other way around. The job is to triangulate: run tight user loops to validate perceived value, instrument objective model quality, and explore novel interaction patterns that customers can’t yet articulate. I treat this as a portfolio of discovery bets: some customer-led, some capability-led, all evaluated against clear outcome thresholds.
Balancing research development with product development requires an organizational design that avoids a context-switching tax while preserving velocity. I pair research pods with product pods, supported by forward-deployed engineers and domain PMs who translate evaluation metrics into user-visible milestones. Safety and content moderation sit on the critical path, not as afterthoughts: think policy definition, classifier tooling, abuse red teaming, and clear escalation playbooks. This balance is how you move from a great demo to a dependable product without losing momentum.
Goal-setting amidst constant change in AI starts with writing OKRs around outcomes rather than outputs. I write OKRs in terms of user impact and model performance thresholds (for example, target ranges for latency, quality scores against a golden dataset, or creator retention), then let teams choose the highest-leverage outputs (data pipelines, fine-tuning, UX improvements) to get there. I also don’t plan very far ahead: I treat the annual view as a vision and bet map, the quarterly view as a constrained slate of outcomes, and the 6–8 week cycle as the execution heartbeat. AI roadmaps are hypotheses; evaluation harnesses and launch gates are the truth.
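To make the idea of outcome thresholds and launch gates concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular team's tooling: the metric names, the numeric targets, and the helper names (`Threshold`, `evaluate_launch_gate`) are all hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """An acceptable range for one outcome metric (bounds are inclusive)."""
    name: str
    minimum: float = float("-inf")
    maximum: float = float("inf")

    def passes(self, value: float) -> bool:
        return self.minimum <= value <= self.maximum


def evaluate_launch_gate(thresholds, measured):
    """Return (ok, failures). A metric with no measurement counts as a failure."""
    failures = []
    for t in thresholds:
        value = measured.get(t.name)
        if value is None or not t.passes(value):
            failures.append(t.name)
    return (not failures, failures)


# Hypothetical key results; the numbers are placeholders, not real targets.
KEY_RESULTS = [
    Threshold("p95_latency_seconds", maximum=8.0),
    Threshold("golden_set_quality_score", minimum=0.80),
    Threshold("creator_week4_retention", minimum=0.35),
]

# Hypothetical measurements, e.g. produced by an evaluation harness run.
measured = {
    "p95_latency_seconds": 6.2,
    "golden_set_quality_score": 0.77,
    "creator_week4_retention": 0.41,
}

ok, failures = evaluate_launch_gate(KEY_RESULTS, measured)
print(ok, failures)  # prints: False ['golden_set_quality_score']
```

The gate only checks outcomes. It says nothing about which output (a data pipeline, a fine-tune, a UX change) the team should pick to close the gap on the failing metric, which is the point of writing the OKR this way.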
Community is a force multiplier. Forming and fostering a vocal community requires real access and real listening: early release cohorts, office hours, and transparent changelogs. How users are picked for early release matters: look for diversity of use cases, sophistication of workflows, and willingness to give crisp feedback. Expanding past the first 100 users of Gen-2 demands readiness: evaluation parity across modalities, scalable infra, and safety coverage. Done well, this motion compounds learning while building authentic advocacy.
For founders, my advice echoes the core lessons above. Start with a narrow, high-intent wedge and prove durable value fast; let founder-led GTM compress the feedback loop; instrument everything from day one; and resist the urge to over-plan features before you’ve nailed outcomes. Product-market fit lessons in AI often arrive via small, fast experiments — not grand, long-range plans. Ship thin slices that demonstrate unmistakable value, then iterate toward a system, not a single feature. When in doubt, shorten the loop and improve the evaluation harness.
People often ask: Will AI replace video editors? My view is that AI will replace zero editors who master these tools — and many who don’t. The winners blend taste, storytelling, and generative leverage. The products we build should honor this reality: design for control, iteration, and co-creation, not just automation.
If you’re mapping the progression of the technology and its use cases, a few public references are instructive: Runway Gen-1 (https://research.runwayml.com/gen1) and Runway Gen-2 (https://research.runwayml.com/gen2) show how capability unlocks new workflows and demand. Runway’s 30 AI Magic Tools (https://runwayml.com/ai-magic-tools/) illustrates portfolio thinking: a suite of composable powers rather than a monolith.
For builders taking generative AI from product prototyping through production: keep your demo muscle strong, your evaluation stronger, and your outcomes strongest. Invest in community, treat safety as a feature, and let your OKRs steer what ships, not the other way around.











