past event · recordings available

Code with Claude — San Francisco 2026

Every session on one page. Two parallel streams · 19 sessions · 9 of these were re‑aired in London on May 19 (look for the London replay badge).
Date: Wed, May 6, 2026
Venue: San Francisco, USA
Format: In‑person + livestream · 2 stages
Recordings: Available on each session page
Source: claude.com/code-with-claude/san-francisco
Extended (May 7): SF extended
London replay: May 19, 2026

Day timeline

09:00 – 10:00 Opening keynote
10:30 – 12:30 Morning sessions (2 streams)
13:00 – 13:30 Dario & Daniela fireside
13:50 – 17:35 Afternoon sessions (2 streams)
17:35 – 18:05 Closing parallel sessions

Schedule (2 stages in parallel)

Time slots (Stream A · Stream B): 09:00 · 10:30 · 11:15 · 12:00 · 13:00 · 13:50 · 14:35 · 15:20 · 16:05 · 16:50 · 17:35

All sessions · in time order

"GitHub's Copilot team ships Claude to millions of developers across chat, CLI, coding agent, and code review, and has become one of the most demanding users of the Claude Platform. GitHub CPO Mario Rodriguez and Anthropic's Brad Abrams break down how the team pushes quality up and costs down at scale, from caching and evaluation to the new Advisor strategy."

11:15 – 11:45 Claude Platform

"Over the last year, capabilities that used to require heavy scaffolding have moved into the model: reliable tool use, context management, writing and running code, computer use, and more. This session walks through these capabilities, shows what changed between model generations, and demonstrates how they compose into agents that finish work instead of just starting it."

"Building agents used to mean spending development cycles on secure infrastructure, state management, permissioning, and reworking your agent loops for every model upgrade. Managed Agents, on the Claude Platform, now handles that layer for you. This session covers how to build and deploy a production‑grade agent at scale."

"When Opus 4.5 landed, v0 was ready on day one — not by luck, but by design. Guillermo Rauch sits down with Angela Jiang at Anthropic to unpack how Vercel architects for model step‑changes: the bets that paid off, the ones that didn't, and what becoming an 'agent‑pilled company' actually looks like inside a frontier platform team."

"At Datadog, 90% of engineers adopted AI coding tools for production work in the last four months, with Claude Code driving two‑thirds of that usage. As sessions grew more ambitious, the reusable tools they produced — for verification, debugging, orchestration — sprawled into unmaintainable one‑offs. Sesh Nalla, VP of Engineering, shares how Datadog built Temper: a constrained framework that produces secure, reusable tools that compound across sessions and teams instead."

"Most teams shipping AI products can't build evals that predict how a model will actually perform in production. Michele Catasta, President & Head of AI at Replit, shares how his team closed that gap with ViBench — a public vibe‑coding benchmark that scores whether the generated app works — and the offline/online evaluation loop behind Replit Agent that turns weeks of engineering into compounding overnight gains. Anthropic's Hannah Moran joins to share what separates evals that look rigorous from ones that actually help teams adopt new models with confidence."

"Cut cost, manage context, boost intelligence. In this session, we'll show you how to put our latest platform capabilities to work. Through live demos you'll see what great prompt caching looks like, learn to keep context lean for long‑running agents with tool search, programmatic tool calling, and compaction, and use the advisor strategy for a cost‑effective intelligence boost."

"Local agents hit a ceiling — they compete for your machine's resources, can't verify their own work, and bottleneck at one or two tasks at a time. Alexi Robbins, Head of Engineering for Cursor's async agents, share how they gave each agent its own isolated VM so agents can write code, spin up browsers, test their own changes, and deliver merge‑ready PRs in parallel — now behind 30%+ of Cursor's internal merged PRs."