Winners & losers: Job candidates at top tech firms are beginning to look for something new alongside salary, equity, and bonuses: access to artificial intelligence compute. Engineers interviewing at companies like OpenAI and other AI-focused startups now want to know not just what they'll earn, but also how much GPU and inference capacity they will have to work with. The shift – treating AI compute as personal working capital – is starting to influence how some companies think about talent acquisition, productivity, and budgets.

Inference has quietly become one of the most valuable resources inside software companies. Once just a line buried in cloud bills, it is now treated as a unit of power – one that can determine how fast engineers build products and which experiments actually get off the ground.

OpenAI engineer Thibault Sottiaux recently noted on X that candidates frequently ask how much "dedicated inference compute" they would receive if they joined his team. This question reflects a deeper reality: usage per user is growing faster than overall user growth, suggesting that compute supply is tightening even as demand soars.

That scarcity is creating a hierarchy of access within technical organizations. Teams with access to GPUs or high-performance inference budgets move faster and ship more than those left waiting in the queue.

OpenAI President Greg Brockman put it bluntly: the amount of inference compute available to an engineer increasingly determines software productivity. Inside labs like OpenAI and Anthropic, compute allocation is now treated almost like budget approval – an internal currency traded among priorities and projects.

Outside those labs, investors are taking notice. Theory Ventures partner Tomasz Tunguz says inference access is effectively becoming a fourth pillar of engineering compensation, alongside cash, equity, and bonuses. He predicts engineers will soon negotiate "token budgets" alongside pay.

Tokens are the numerical building blocks that models use to process data – one token equals roughly three-quarters of a word – and inference platforms price access per million tokens. The more tokens allocated to a developer or project, the more work the AI can perform.
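To make the unit concrete, here is a rough sketch of how a dollar budget translates into tokens and words. The three-quarters-of-a-word ratio comes from the rule of thumb above; the $10-per-million-token price and the $1,000 budget are purely hypothetical placeholders, not figures from any provider or from the people quoted here.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb: one token is roughly three-quarters of a word


def tokens_for_budget(budget_usd: float, price_per_million_usd: float) -> float:
    """Number of tokens a dollar budget buys at a flat per-million-token price."""
    return budget_usd / price_per_million_usd * 1_000_000


def approx_words(tokens: float) -> float:
    """Approximate word count for a given number of tokens."""
    return tokens * WORDS_PER_TOKEN


# Hypothetical inputs for illustration only.
budget_usd = 1_000.0
price_per_million_usd = 10.0

tokens = tokens_for_budget(budget_usd, price_per_million_usd)
words = approx_words(tokens)
print(f"{tokens:,.0f} tokens, or about {words:,.0f} words")
```

Real pricing varies by model and typically differs for input and output tokens, so a flat rate like this is only a first approximation.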

The shift is pushing finance teams to think differently about compensation. Tunguz estimates that for a senior engineer earning $375,000, adding $100,000 in annual inference usage brings the total employment cost to $475,000, with inference making up roughly a fifth of that figure. These costs, historically invisible, are now appearing as recurring expenses directly tied to productivity.
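The arithmetic behind those figures is easy to check. The sketch below uses only the two numbers cited by Tunguz; the function name and the choice to report both ratios are illustrative. It separates two percentages that are easy to conflate: how much inference adds on top of salary, and how large a share of the combined cost it represents.

```python
def inference_cost_breakdown(salary: float, inference: float) -> dict:
    """Compare inference spend against salary and against the combined total."""
    total = salary + inference
    return {
        "total_cost": total,
        # How much inference adds relative to salary alone.
        "increase_over_salary": inference / salary,
        # How much of the combined cost is inference.
        "inference_share_of_total": inference / total,
    }


breakdown = inference_cost_breakdown(375_000, 100_000)
print(f"Total cost: ${breakdown['total_cost']:,.0f}")                              # $475,000
print(f"Increase over salary: {breakdown['increase_over_salary']:.1%}")            # 26.7%
print(f"Inference share of total: {breakdown['inference_share_of_total']:.1%}")    # 21.1%
```

In other words, $100,000 of inference raises the cost of a $375,000 engineer by a little over a quarter, and accounts for about a fifth of the resulting $475,000.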

"It is starting to happen," Tunguz told Business Insider, as employee use of AI increasingly contributes to company cash burn and draws scrutiny from CFOs.

The logic extends to output as well. If cloud infrastructure performance can be measured as profit per GPU hour, an engineer's metric might soon be productive work per dollar of inference. Tunguz, who uses AI tools to automate 31 tasks a day, says his personal annual inference cost sits around $12,000. Scale that across teams deeply embedded in generative AI workflows, and usage becomes more than a budgetary concern – it's part of the job.
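As a back-of-the-envelope illustration of "work per dollar of inference," Tunguz's own numbers can be turned into a cost per automated task. This sketch assumes the 31 tasks run every day of the year, which the source does not state; with fewer working days the cost per task would be higher.

```python
def cost_per_task(annual_cost_usd: float, tasks_per_day: int, days_per_year: int = 365) -> float:
    """Average inference spend per automated task over a year."""
    return annual_cost_usd / (tasks_per_day * days_per_year)


def tasks_per_dollar(annual_cost_usd: float, tasks_per_day: int, days_per_year: int = 365) -> float:
    """Inverse metric: automated tasks completed per dollar of inference."""
    return (tasks_per_day * days_per_year) / annual_cost_usd


# Figures cited in the article; days_per_year=365 is an assumption.
per_task = cost_per_task(12_000, 31)
per_dollar = tasks_per_dollar(12_000, 31)
print(f"About ${per_task:.2f} per task, or {per_dollar:.2f} tasks per dollar")
```

Under that assumption the figure works out to about $1.06 per task, or a little under one task per dollar.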

Peter Gostev, who leads AI capability at model-performance startup Arena, has proposed one idea to make this transparent: job listings that advertise not only salary ranges but also token budgets. These, he says, would give candidates a clear picture of the compute access they can expect before joining. Inference access, once a technical footnote, is now becoming a recruiting tool.