Claim
Beyond ~30B parameters, the capex of pre-training your own base model no longer makes economic sense. The defensible asset shifts to post-training: fine-tuning, RAG, reward design, and proprietary feedback loops on data you uniquely capture. Cursor is the canonical proof: the moat behind its $300M ARR is its record of which suggestions users accept and reject, not its IDE chrome.
Mechanism
Pre-training capex scales with model size, and revenue from a self-trained base model rarely justifies the spend once a frontier provider exists. Post-training is the cheaper, defensible layer: your fine-tuned model's behavior is a function of your accumulated user interactions. As long as you keep collecting interaction data, your lead over competitors compounds. Pre-training is a one-time spend in a public market; post-training is a moat that compounds in your private one.
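The feedback-to-fine-tune loop can be sketched minimally: each prompt's accepted and rejected suggestions become preference pairs, the standard input for preference-based post-training (e.g. DPO). The schema and field names here are hypothetical, not Cursor's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """One user interaction captured by the product surface (hypothetical schema)."""
    prompt: str
    suggestion: str
    accepted: bool

def to_preference_pairs(interactions):
    """Group interactions by prompt and emit {prompt, chosen, rejected} records
    suitable for preference-based post-training such as DPO."""
    by_prompt = {}
    for it in interactions:
        bucket = by_prompt.setdefault(it.prompt, {"accepted": [], "rejected": []})
        bucket["accepted" if it.accepted else "rejected"].append(it.suggestion)
    pairs = []
    for prompt, buckets in by_prompt.items():
        # Every accepted suggestion is preferred over every rejected one
        # for the same prompt.
        for chosen in buckets["accepted"]:
            for rejected in buckets["rejected"]:
                pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs
```

The compounding claim lives in this function: competitors without the product surface cannot reconstruct these pairs, and each release cycle adds more of them.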
Conditions
Holds when:
- You have a UX surface that produces ranking-worthy interaction signals (accepts/rejects, ratings, follow-ups).
- You can run the post-training loop frequently — synthetic data generation, eval design, RAG updates.
Fails when:
- The product surface produces sparse or noisy signals. Garbage data degrades fine-tunes.
- The base model improves so quickly that your fine-tune becomes redundant. Watch for capability waves that obsolete your customizations.
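The sparse-or-noisy-signal failure mode can be guarded with a simple gate before each fine-tune run: only prompts with enough interaction volume and a clear accept/reject split contribute training data. This is a minimal sketch; the thresholds are illustrative, not tuned values.

```python
def filter_trainable_prompts(stats, min_total=20, min_margin=0.2):
    """stats: dict mapping prompt -> (accepted_count, rejected_count).
    Returns prompts with enough volume and a clear preference margin.
    min_total and min_margin are hypothetical thresholds."""
    keep = []
    for prompt, (acc, rej) in stats.items():
        total = acc + rej
        if total < min_total:
            continue  # sparse signal: too few interactions to trust
        margin = abs(acc - rej) / total
        if margin < min_margin:
            continue  # noisy signal: users split near 50/50, no clear preference
        keep.append(prompt)
    return keep
```

Running this gate on every cycle is one concrete way "garbage data degrades fine-tunes" becomes an operational check rather than a slogan.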
Evidence
Asha cites Nathan Lambert's research: "Once a model hits 30B parameters, the CapEx to pre-train doesn't make economic sense; you should fine-tune instead."
"50% of developers are now fine-tuning. When you go through the full loop — synthetic data generation, rewards design, A/B testing rigorously, extract job-to-be-done — you get better results faster."
— Asha Sharma on Lenny's Podcast, 2026-04-28
Cursor's moat is named explicitly: the data from accepted vs. rejected suggestions, with the model retrained on it continuously.
Signals
- The team's data flywheel produces monthly model updates with measurable improvement.
- Fine-tuning is on a normal release cadence, not a research project.
- The product gets better the more it's used, in measurable axes (suggestion acceptance rate, time-to-task, etc.).
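The last signal is directly measurable: track suggestion acceptance rate over time and check that it trends up as usage accumulates. A minimal sketch, assuming events are logged as (month, accepted) pairs; the event shape is an assumption, not a known schema.

```python
from collections import defaultdict

def acceptance_rate_by_month(events):
    """events: iterable of (month_str, accepted_bool) pairs.
    Returns {month: acceptance_rate}, ordered by month."""
    counts = defaultdict(lambda: [0, 0])  # month -> [accepted, total]
    for month, accepted in events:
        counts[month][1] += 1
        if accepted:
            counts[month][0] += 1
    return {m: acc / total for m, (acc, total) in sorted(counts.items())}
```

A flat or declining curve here is an early warning that the flywheel is a release ritual, not a moat.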
Counter-evidence
Benjamin Mann at Anthropic argues the foundation-model layer is still where the most compounding happens: "the model will eat your scaffolding for breakfast." Sherwin Wu echoes this: customer fine-tunes can be obsoleted by the next base model. The post-training moat holds within the current model generation; a pre-training breakthrough can collapse it overnight.
Cross-references
- Treat the product as a living organism with a metabolism, not a shipped artifact — the broader framing this moat lives inside
- Plan in seasons keyed to secular changes, not 6-month roadmaps — the cadence implication