A previous post on this blog treated “model multipliers” and “usage-based billing” as if they describe the same cost problem. A sharp comment on LinkedIn pointed out they do not. This post sets that straight, including one insight that changes the sub-agent picture entirely.

What changes on June 1

Starting June 1, 2026, most Copilot subscribers move to usage-based billing:

  • Each interaction consumes tokens, priced per model
  • Your plan includes a monthly AI credits allowance
  • Exceeding your allowance means paying for additional credits at published per-token rates
  • Code completions remain unlimited and are not affected
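The allowance-plus-overage arithmetic above can be sketched in a few lines. This is a minimal illustration, not GitHub's billing logic: the model names, the per-million-token rates in `RATES`, and the idea of expressing the allowance in dollars are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical rates in USD per million tokens as (input, output).
# These are placeholders, not GitHub's published prices.
RATES = {"light": (1.0, 5.0), "frontier": (5.0, 25.0)}


@dataclass
class Interaction:
    model: str
    input_tokens: int
    output_tokens: int


def interaction_cost(i: Interaction) -> float:
    """Price one interaction from its token counts and the model's rates."""
    rate_in, rate_out = RATES[i.model]
    return (i.input_tokens * rate_in + i.output_tokens * rate_out) / 1_000_000


def monthly_bill(interactions: list[Interaction], allowance_usd: float) -> tuple[float, float]:
    """Return (total consumption, amount billed beyond the allowance)."""
    used = sum(interaction_cost(i) for i in interactions)
    return used, max(0.0, used - allowance_usd)


month = [
    Interaction("frontier", input_tokens=1_000_000, output_tokens=200_000),
    Interaction("light", input_tokens=2_000_000, output_tokens=1_000_000),
]
used, overage = monthly_bill(month, allowance_usd=10.0)
```

Swapping in the real published rates and your plan's actual allowance turns this into a rough projection of what a month would cost.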

The multiplier system continues only for Copilot Pro and Copilot Pro+ subscribers on existing annual plans who stay on request-based billing, for example because of contractual obligations. For everyone else, multipliers disappear on June 1.

The cost comparison that actually matters

On usage-based billing, the per-token rates for Claude models inside Copilot are close to Anthropic’s standard API rates. That leads to three observations:

  • Using Claude Opus in a Copilot Agent session costs roughly the same as using the Claude API directly
  • It is not comparable to a Claude.ai Pro subscription, which is a flat monthly fee; for heavy users, the flat subscription wins
  • On request-based billing, large implementation tasks can be cheaper than on usage-based billing, depending on token consumption. This only becomes clear once you have real usage data under the new model
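To make the "heavy users" point concrete, here is a rough break-even sketch. The flat fee and the blended per-million-token rate are made-up inputs, and the comparison ignores any usage limits a flat plan may impose, so treat it as a way to frame the question rather than a pricing calculator.

```python
def break_even_tokens(flat_fee_usd: float, blended_rate_per_million: float) -> float:
    """Monthly token volume at which per-token spend equals the flat fee."""
    if blended_rate_per_million <= 0:
        raise ValueError("rate must be positive")
    return flat_fee_usd / blended_rate_per_million * 1_000_000


def cheaper_option(monthly_tokens: int, flat_fee_usd: float, blended_rate_per_million: float) -> str:
    """Return 'flat' when per-token spend would exceed the flat fee, else 'per-token'."""
    per_token_cost = monthly_tokens * blended_rate_per_million / 1_000_000
    return "flat" if per_token_cost > flat_fee_usd else "per-token"


# Hypothetical inputs: a 20 USD flat plan versus a blended 10 USD per million tokens.
threshold = break_even_tokens(20.0, 10.0)
heavy_user = cheaper_option(5_000_000, 20.0, 10.0)
light_user = cheaper_option(500_000, 20.0, 10.0)
```

Once the usage preview gives you a real monthly token count, plugging it in shows which side of the threshold you are on.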

The sub-agent cost flip

This is where usage-based billing changes things most significantly, and it is easy to miss.

Under request-based billing:

  • The price of an agentic session is determined by the highest-multiplier model used anywhere in the agent chain
  • If you have 20 sub-agents and one uses Opus, the entire session is billed at the Opus rate
  • Cheaper models in other sub-agents do not affect the price
  • There is no financial incentive to optimise individual sub-agent model choices

Under usage-based billing:

  • Every sub-agent’s token consumption is billed individually
  • Using Opus across all sub-agents now costs significantly more than mixing in lighter models where heavy reasoning is not needed
  • Sub-agent model selection suddenly matters a great deal
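The flip is easier to see with numbers. The sketch below contrasts the two rules as described above: under request-based billing the session takes the highest multiplier in the chain, while under usage-based billing every sub-agent's tokens are priced separately. The multipliers, the blended token rates, and the token counts are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical values, not real Copilot multipliers or prices.
MULTIPLIER = {"light": 1.0, "opus": 10.0}    # request-based multipliers
TOKEN_RATE = {"light": 1.0, "opus": 15.0}    # blended USD per million tokens


@dataclass
class SubAgent:
    model: str
    tokens: int


def request_based_multiplier(agents: list[SubAgent]) -> float:
    """Session is billed at the highest multiplier used anywhere in the chain."""
    return max(MULTIPLIER[a.model] for a in agents)


def usage_based_cost(agents: list[SubAgent]) -> float:
    """Each sub-agent's token consumption is priced at its own model's rate."""
    return sum(a.tokens * TOKEN_RATE[a.model] for a in agents) / 1_000_000


# 20 sub-agents, 100,000 tokens each.
all_opus = [SubAgent("opus", 100_000) for _ in range(20)]
mixed = [SubAgent("opus", 100_000)] + [SubAgent("light", 100_000) for _ in range(19)]

same_under_requests = request_based_multiplier(all_opus) == request_based_multiplier(mixed)
all_opus_cost = usage_based_cost(all_opus)
mixed_cost = usage_based_cost(mixed)
```

With these assumed numbers, both sessions land on the same multiplier under request-based billing, while the usage-based cost of the all-Opus chain is several times that of the mixed chain. The exact ratio depends entirely on the real rates and on how many tokens each sub-agent consumes.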

For anyone running complex agentic workflows with multiple sub-agents today, this is the most important change to understand before June 1.

The practical takeaway

Moving reasoning work out of VS Code into a separate tool still makes sense, but the reason differs by billing situation:

  • Usage-based billing: compare the per-token cost of a frontier model in Copilot against a flat Claude.ai or ChatGPT subscription, and start thinking about which sub-agents actually need a powerful model
  • Request-based billing: ask whether spending your monthly quota on reasoning tasks is worth it versus saving it for work that needs codebase context

Check your Copilot usage dashboard before June 1. GitHub is running a usage preview through the end of May so you can see your projected consumption before the switch happens.

