GitHub just switched Copilot to metered billing, and developers are watching months of credits vanish in a single day

The token bill is due

By Skye Jacobs, June 3, 2026, 9:18

Bottom line: GitHub's move from flat-rate "requests" to metered usage is forcing many developers to confront something they had largely ignored: how many tokens their everyday coding habits consume and what that usage actually costs. As the new credit-based pricing exposes the expense of long chats, large context windows, and frontier models, many are rethinking how – and how often – they rely on AI in their day-to-day work.

Back in April, GitHub said it would move all Copilot plans to a usage-based system that bills for actual AI consumption, measured in tokens, starting June 1. Under the old setup, subscribers drew from a pool of "requests" and "premium requests," and each interaction counted the same whether they were asking a quick question or letting Copilot grind away for hours on a complex refactor.

GitHub said that model meant the service was absorbing "much of the escalating inference cost" from heavy users. That cross-subsidy is now over. Instead, users are faced with a meter tied directly to the size of their prompts, the length of Copilot's responses, and, crucially, the model they choose.

Credit: twhoff / Reddit

On paper, the new pricing structure looks simple enough. Each paid plan comes with a bundle of AI credits, with one credit representing one cent of usage. The $10-per-month Pro tier includes 1,500 credits, or $15 worth of AI usage. Pro+ costs $39 per month and includes 7,000 credits, or $70 in usage, while the top-tier Copilot Max plan costs $100 and comes with 20,000 credits, equivalent to $200 in usage.
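
The credit arithmetic can be sketched in a few lines of Python. This is an illustration only: the plan prices and credit bundles are the figures quoted above, the one-cent-per-credit rate is as stated, and the names `PLANS`, `credits_to_usd`, and `usage_per_dollar` are hypothetical helpers, not part of any GitHub API.

```python
# One credit represents one cent of usage, per the figures quoted above.
CENTS_PER_CREDIT = 1

# Plan name -> (monthly price in USD, included credits), as quoted in the article.
PLANS = {
    "Pro": (10, 1_500),
    "Pro+": (39, 7_000),
    "Max": (100, 20_000),
}


def credits_to_usd(credits: int) -> float:
    """Convert a number of credits into dollars of AI usage."""
    return credits * CENTS_PER_CREDIT / 100


def usage_per_dollar(plan: str) -> float:
    """Dollars of included usage per dollar of subscription price."""
    price_usd, credits = PLANS[plan]
    return credits_to_usd(credits) / price_usd


for name, (price_usd, credits) in PLANS.items():
    print(
        f"{name}: ${price_usd}/month includes ${credits_to_usd(credits):.2f} "
        f"of usage ({usage_per_dollar(name):.2f}x the subscription price)"
    )
```

Under these numbers, the included usage per subscription dollar rises with the tier, from 1.5x on Pro to 2x on Copilot Max.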

The catch is that those credit pools can disappear at dramatically different rates depending on whether users stick to lightweight models or rely on the largest and most expensive ones.

That disparity becomes clear when comparing models. One million output tokens from a smaller OpenAI model such as GPT-5.4 nano costs about $1.25 through Copilot. The same volume generated by the frontier-class GPT-5.5 costs roughly $30.
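
To put that gap in terms of a monthly allowance, here is a rough Python sketch. It uses only the two output-token prices quoted above and the one-cent credit; it deliberately ignores input tokens, which are also billed but whose rates are not given here, so real budgets would stretch less far. The function names are hypothetical.

```python
# USD per one million output tokens, as quoted in the article.
OUTPUT_USD_PER_MILLION = {
    "GPT-5.4 nano": 1.25,
    "GPT-5.5": 30.00,
}


def output_cost_credits(model: str, output_tokens: int) -> float:
    """Credits consumed by generating `output_tokens` (one credit = one cent)."""
    usd = OUTPUT_USD_PER_MILLION[model] * output_tokens / 1_000_000
    return usd * 100


def output_tokens_for_budget(model: str, credits: int) -> int:
    """Output tokens a credit budget buys, ignoring input-token charges."""
    usd = credits / 100
    return int(usd / OUTPUT_USD_PER_MILLION[model] * 1_000_000)


PRO_CREDITS = 1_500  # the Pro tier's monthly bundle
for model in OUTPUT_USD_PER_MILLION:
    tokens = output_tokens_for_budget(model, PRO_CREDITS)
    print(f"{model}: {tokens:,} output tokens on a Pro allowance")
```

On output alone, a Pro allowance covers 12 million tokens from the smaller model but only 500,000 from the frontier one, a 24-fold difference.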

For developers who typically use the default settings, that difference was easy to ignore when everything consumed a single premium request. Under usage-based billing, however, the same "let's see what it does" approach can burn through a month's worth of credits in just a few sessions.

The impact is already visible in figures users are sharing. A simple "build a Minesweeper game" prompt run through Claude Haiku 4.5 via Copilot consumed about 94 credits. For a toy project, that may seem manageable. But when users move from toy projects to production workloads, consumption can rise quickly.

One person reported a single complex prompt consuming 171 credits. Another said that "a few prompts" used up 700 credits. In one case, a couple of Copilot-driven commits consumed 5,000 credits – a full quarter of Copilot Max's monthly allowance.

Even ostensibly routine work is proving more expensive than some developers expected. Users have complained about spending 15 credits on what they describe as a "run-of-the-mill query" or 100 credits to generate a small plan.

One user said they were "super cautious on the first day," limiting their experimentation with Claude Sonnet 4.6, yet still spent 840 credits.

Another, after watching 21 percent of their Pro credits disappear in a single day, concluded: "I have a feeling I'll be going somewhere else pretty soon."

Some users are treating the new pricing model as a nudge to tighten up their workflows. Developer Henri Kinnunen said they used only 161 credits during a productive day by making "very focused and deliberate changes with AI" while using GPT-5.3-Codex.

Others are revisiting long-running habits that made sense when token usage was effectively invisible. On Bluesky, developer Neil Hewitt pointed out that keeping a three-day chat thread alive means sending the entire conversation back as context with every request. Those past messages all count as input tokens, and input tokens now have a direct cost. As he put it, "input tokens use credits… it's not rocket science."
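
Hewitt's point can be illustrated with a simplified Python model. It assumes every turn adds the same number of tokens to the thread and that the full history is re-sent with each request; it ignores prompt caching, context truncation, and summarization, any of which would lower the real figure. The turn count and tokens-per-turn values are made up for illustration.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens billed when each request re-sends the whole thread."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn  # the new message joins the thread
        total += history            # the entire thread goes out as input
    return total


def fresh_thread_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens if every request started a new, empty thread."""
    return turns * tokens_per_turn


# Hypothetical workload: 100 turns of 500 tokens each.
long_thread = cumulative_input_tokens(100, 500)
fresh = fresh_thread_input_tokens(100, 500)
print(f"one long thread: {long_thread:,} input tokens")
print(f"fresh threads:   {fresh:,} input tokens")
print(f"ratio:           {long_thread / fresh:.1f}x")
```

Because the history grows with every turn, total input tokens grow roughly with the square of the thread length. In this toy case, the single long thread sends about 50 times as many input tokens as the same work split across fresh threads.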

The backlash has prompted some users to explore alternatives with less aggressive pricing, at least for now. One developer described integrating DeepSeek into a GitHub and VS Code workflow and estimated the cost at "about 7 cents for 15 million tokens" – a figure that illustrates just how wide the pricing gap can be between providers and models.

At the same time, there is a growing sense that Copilot's move may not remain an outlier for long. If other AI coding assistants follow the same path, the era of flat-fee, "all-you-can-eat" access could be coming to an end.

For developers, that shifts the conversation from "What can this model do?" to "What is this task worth?" Token efficiency, context management, and model selection – once abstract concerns – are becoming operational decisions with real budget implications. The challenge is no longer just getting the best answer, but getting an answer that is good enough without quietly burning through next month's AI budget in the process.
