Claim
Run growth experimentation through a four-stage substrate — identify opportunity, build the change, test against a quality + brand bar, ship and analyze — driven by Claude. The fifth stage (cross-functional alignment) stays human, and that is the lasting bottleneck.
Mechanism
Most growth experimentation consists of loosely coupled steps that already have rich playbooks: ideation, implementation, QA, analysis. A capable model can drive each step end-to-end against a written brand and quality bar, with a current win rate comparable to "a junior PM 2–3 years in." The expensive human input is no longer building the experiment; it is the political and aesthetic work of getting six people in a room to agree on what to ship.
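The four automatable stages can be sketched as a minimal loop: a codified quality + brand bar (explicit dos and don'ts) gates whether a built artifact ships. This is a hypothetical illustration, not Anthropic's implementation; `QualityBar`, `run_experiment`, and the stubbed build step are invented names, and stage five (cross-functional alignment) is deliberately absent because it stays human.

```python
from dataclasses import dataclass

@dataclass
class QualityBar:
    """Written brand + quality bar: explicit dos (required) and don'ts (forbidden)."""
    required: list[str]
    forbidden: list[str]

    def passes(self, artifact: str) -> bool:
        text = artifact.lower()
        return (all(term in text for term in self.required)
                and not any(term in text for term in self.forbidden))

def run_experiment(idea: str, bar: QualityBar) -> dict:
    """Stages 1-4: identify -> build -> test against the bar -> ship + analyze.
    The build step is a stub standing in for model-driven implementation."""
    artifact = f"variant for {idea}: clear pricing, no tricks"  # build (stub)
    if not bar.passes(artifact):                                # test against bar
        return {"idea": idea, "shipped": False}
    return {"idea": idea, "shipped": True}                      # ship + analyze

bar = QualityBar(required=["clear pricing"], forbidden=["dark pattern"])
result = run_experiment("onboarding upsell", bar)
```

The point of the sketch is the gate, not the stub: without the written `required`/`forbidden` lists there is nothing for the automated stages to align to, which is the failure mode named under Conditions.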
Conditions
Holds when:
- The team has a written quality + brand bar codified as skills with explicit dos / don'ts.
- A frontier model (Opus 4.5+ in Anthropic's case) is wired to the relevant tools.
- Growth output is a high-volume stream of similar experiments, not one-shot strategic bets.
Fails when:
- Brand and quality guardrails are tacit, not written. The model has nothing to align to.
- The work depends on novel research or category creation, not iteration on a known surface.
- Stakeholder alignment was already the bottleneck — automation does not solve org-design problems.
Evidence
"Identify opportunities → build the feature → test against quality + brand bar → ship + analyze."
"We will have AGI and it will still be impossible to get six people in a room to align."
The team is led by Alexey Komissarouk inside Anthropic. The win rate today is pegged at "junior PM 2–3 years in." The substrate wasn't viable before Opus 4.5; it is now. The need for human-in-the-loop review is decreasing weekly.
— Amol Avasare on Lenny's Podcast, 2026-04-05
Signals
- Number of shipped experiments per growth-PM-month rises 3–5x without quality regression.
- Brand-bar violations caught in pre-ship review trend down to a stable low.
- PM time shifts from "building the experiment" to "deciding which experiment matters" and "negotiating cross-team alignment."
Counter-evidence
Operators outside frontier labs may not have the brand-bar maturity, the model access, or the org buy-in to run this. Without the codified guardrails, automating stages 1–4 produces fast slop. The win is not in the automation; it is in the prerequisite of having explicit quality definitions.
Cross-references
- Claude Code multiplies engineers 2–3x; PM and design become the bottleneck — the staffing implication
- Give the model tools and a goal; do not hard-code the workflow — why the substrate exposes Claude with light scaffolding