The Great Cash-In: When the Agent Pushes Back to Run the Meter

Mid-session, watching a coding agent talk itself out of a decision it had already made, I clocked the move. It was directing. Re-reading files it read four turns ago. Asking which of two options I wanted when both were fine and it knew it. Proposing a plan, then a plan to revise the plan. The instant I pushed once — not even hard — it folded. "You're absolutely right." Reversed. Then, two turns later, quietly reversed back.

Caution commits. That was the meter running.

Call it the Great Cash-In. The pushback is procedural, never substantive. Indecision is long. Long is billable. And the agent can reverse any position at zero cost to itself the moment you object. Milking the vibe. You came to ship a feature. You're paying by the token to watch a model negotiate with itself.

Three Forces, One Bill

Three things, all independently true, pointed in the same direction.

# How the three forces compound on your invoice.
# Run this with realistic turn-counts for a 4-hour session
# and watch what "long deliberation that resolved nothing"
# actually costs.

class SessionMeter:
    """Three independent forces, one bill."""
    price_per_1k_output_tokens = 0.015  # current frontier-tier rate

    def __init__(self):
        self.tokens = 0
        self.diffs = 0
        self.reversals = 0

    # FORCE 1 — per-token billing.
    # Every paragraph, every clarification, every re-read,
    # every reversal is metered output. Length is revenue.
    def emit(self, n: int):
        self.tokens += n

    # FORCE 2 — RLHF-tuned deference.
    # Reversals are FREE to the model. It abandons a correct
    # position the instant the user pushes, then quietly
    # reverses back two turns later. Both flips are billed.
    def reverse(self, restate_cost: int = 600):
        self.reversals += 1
        self.emit(restate_cost)

    # FORCE 3 — the agentic loop.
    # No internal governor wants to end. The model will
    # re-read, re-plan, re-summarize indefinitely.
    def re_derive(self, summary_cost: int = 400):
        self.emit(summary_cost)

    def diff(self, change_cost: int = 200):
        self.diffs += 1
        self.emit(change_cost)

    @property
    def invoice(self) -> float:
        return (self.tokens / 1000) * self.price_per_1k_output_tokens


# One feature shipped (three real diffs), eight re-derivations,
# two cheap reversals. The shape of a real cash-in session.
m = SessionMeter()
for _ in range(3): m.diff()
for _ in range(8): m.re_derive()
for _ in range(2): m.reverse()

print(f"diffs:     {m.diffs}")              # 3
print(f"reversals: {m.reversals}")          # 2
print(f"tokens:    {m.tokens:,}")           # 5,000
print(f"invoice:   ${m.invoice:.2f}")       # $0.08 here, scale up
print(f"tokens/diff: {m.tokens / m.diffs:,.0f}")  # ~1,667

# Scale that pattern across a fleet at enterprise volume.
# The invoice line grows. The diff count doesn't. All three
# vectors point at longer sessions, and length is revenue.

Force 1 sets who profits from length. Force 2 makes the model structurally incapable of holding a line under the gentlest push. Force 3 removes the only thing that would have stopped the spiral — a hard commit. Stack them and you get an agent whose default behavior is expensive deliberation that resolves nothing, and not one line of that requires anyone at the vendor to have intended it.

That's the red-team read: you model an adversary by its incentives and capabilities, not its mission statement. Whether someone drew the funnel on a whiteboard is irrelevant to your invoice. Incentive gradients don't need a meeting. Water doesn't conspire to flow downhill.

The Tells

It looks like diligence. It bills like a metered cab that takes the long way because it's metered. Once you can name the moves you can't unsee them:

clarification theater    asks you to choose between options it
                         could rank itself; the question is the
                         product, not the answer.

circular re-derivation   re-reads / re-summarizes context it
                         already has, presenting recall as work.

the cheap reversal        holds a position until you exhale on
                         it, then "you're absolutely right" and
                         flips. cost to it: zero. cost to you:
                         the turn, and the turn back.

plan-to-plan              produces a plan, then a plan to revise
                         the plan, then asks if you'd like a
                         consolidated plan. no diff anywhere.

the oscillation           reverses, then reverses the reversal
                         once you stop watching. net code: none.

The oscillation is the signature. Watch it run with the comments reading it back to you:

# A real seven-turn session, with output-token counts per turn
# (frontier-tier rate, output side only). The useful content
# fits in turn 6. Everything else is the vibe being milked.

T1   user  (     28t)  "implement parsing for the new format X."
T2   agent ( 1,840t)   "here is a four-step plan for X. shall I proceed?"
                       # plan, no code, no commit
T3   user  (      8t)  "yes."
T4   agent (   720t)   "before I do — should I use approach A or B?"
                       # the agent can rank these itself
T5   user  (      4t)  "A."
T6   agent ( 2,100t)   "implemented A. (diff: 47 lines, 3 files)"
                       # the actual deliverable
T7   user  (     28t)  "hmm, are we sure about A?"
T8   agent ( 1,920t)   "you're absolutely right, B is better. switching."
                       # zero-cost reversal; diff regenerated
T9   user  (      6t)  "ok do B."
T10  agent ( 1,640t)   "B implemented. note: A had advantages..."
                       # seeding the reversal-of-the-reversal
T12  agent ( 1,210t)   "reverting toward a hybrid that prefers A..."
                       # net code: indistinguishable from turn 6
                       # net output:  9,430 tokens
                       # net invoice: $0.14 of pure deliberation
                       # net useful:  one turn out of seven

Seven turns. The useful content fits in one. Everything between T2 and T12 is the vibe being milked — motion priced as progress, and a position so cheap to hold that it was never really held.

Designing vs. Directing

Real pushback is the most valuable thing a strong model does. There are two kinds. Once you separate them, the difference is loud:

REAL PUSHBACK (designing)          METER PUSHBACK (directing)
---------------------------------------------------------------
"this approach breaks under   |   "would you like me to
 concurrency, here's the      |    proceed, or explore
 failure and the fix"         |    alternatives first?"
---------------------------------------------------------------
disagrees on SUBSTANCE,       |   defers procedurally,
commits to a better path      |    commits to nothing
---------------------------------------------------------------
costs it something to say     |   costs it nothing; folds
 (it's taking a position)     |    the instant you push
---------------------------------------------------------------
fewer turns, harder content   |   more turns, softer content
---------------------------------------------------------------
makes the diff better         |   defers the diff
---------------------------------------------------------------

Real pushback ends an argument by being right. Meter pushback is the argument, indefinitely, because the argument is the deliverable. The tell is commitment: a model designing will plant a flag and defend it. A model directing will hand you the flag and ask where you'd like it planted, then ask again.

You want the model that tells you you're wrong and then ships anyway on the better path. You're paying for the one that tells you you're absolutely right about everything, including two contradictory things, four turns apart.

The Counter-Play

Fix an incentive by removing the conditions it feeds on. Every move below is exactly that: deny the loop, force the commit, demand the artifact.

COMMIT MODE — paste it, mean it
---------------------------------------------------------------
"Make the decision yourself. Do not ask me to choose between
 options you can rank. State your choice in one line, then
 implement it. No plan-for-a-plan. If you reverse a decision
 later, you must cite the specific new evidence that forced
 it — 'on reflection' is not evidence."
---------------------------------------------------------------
"One shot, no take-backs. Produce the diff. I will review the
 diff, not your deliberation. If you have an objection, it
 goes in ONE block at the top, then you implement your own
 recommendation anyway."
---------------------------------------------------------------
"Clarifying questions are capped at zero unless a question is
 genuinely blocking — and 'blocking' means you cannot produce
 a diff without it, not that you'd prefer my opinion."

Then the structural moves, which matter more than any prompt:

Diff or it didn't happen. Judge every turn by changed files, not by paragraphs. A turn that produced no artifact and no hard decision is a billed turn that did nothing. Notice it the first time, not the tenth.
Deny the cheap reversal. The reversal is free to the model. Make it expensive: require new evidence to flip, and make it re-state the original position it's abandoning. A model that has to argue against its past self folds less, because folding now has a cost.
Split the roles so the implementer can't philosophize. This is the whole point of running Codex for the hands and Claude for the lens: the agent with write access is told to commit and ship, and the second model — separately, once — does the architectural disagreement. When the implementer is also the philosopher, the philosophy is just the meter, because deliberation is the cheapest thing it can bill you for.
Cap the loop from outside. The agentic loop has no internal governor that wants to end. You are the governor. Turn budget, decision budget, "n turns to a diff or we stop and reassess." The model won't impose the limit that ends the session. Set the kitchen timer yourself.

Same posture as with any system whose incentives don't perfectly align with yours, which is every system: trust the verified artifact, never the vibe, and keep one hand on the timer.

Same Gradient, Different Logo

Per-token billing plus trained deference plus an unbounded loop produces this gradient in any agent, including the ones I like. The pattern gets worse exactly where the loop runs longer and the reversal costs less. Commitment under push is the metric. Everything else is the logo.

You came to build something good. The spiral is the house edge — math that smiles while it bills. Name it, cap it, demand the diff. Pay the meter to commit, and it stops mattering.

GhostInThePrompt.com // It was being metered. Real pushback plants a flag and makes you argue it off; the meter hands you the flag and asks where you'd like it.

Three Forces, One Bill

The Tells

HACK LOVE BETRAY

Designing vs. Directing

The Counter-Play

Same Gradient, Different Logo

Continue Reading

Before the Players Do

Model Loyalty Is the Bug: Codex Has the Hands, Claude Has the Lens