a builder's codex
codex · operators · Kesava Mandiga · ins_kesava-mandiga-cheap-tier-delegation

Cheap external model for grunt work; Claude only sees judgment

By Kesava Mandiga · Head of PMM, JustCall (SaaS Labs) · 2026-05-03 · thread · Kesava on cheap-tier delegation as routing pattern

Tier A · TL;DR
Cheap external model for grunt work; Claude only sees judgment

Claim

Claude Code's weekly limit is the constraint. The fix isn't a smaller Claude — Haiku still hits the same meter. The fix is a different provider for the dumb-pipe work. DeepSeek V4 Flash reads files, summarizes transcripts, generates boilerplate at $0.001 a call against a separate balance, and Claude only spends its limited tokens on the things that need judgment. Subagents don't help. Bash-shell-out to a non-Anthropic API does.

Mechanism

The two limits (token budget and weekly cap) are coupled today because everything routes through Anthropic. Decoupling them via an external dumb-pipe is the only architecture that touches the cap. Once decoupled, Opus-as-orchestrator becomes worthwhile because Opus stops being spent on grunt.

Open the interactive view → View original source → Markdown source →