Tech companies are increasingly under pressure to cut AI costs, and some are now testing whether cheaper, smaller models can do much of the work once reserved for the biggest systems without hurting quality. The shift could reshape the economics of AI, but it is also forcing companies to make difficult tradeoffs about which teams, products, and workloads get access to the most powerful models and which do not.
According to TechCrunch, rising token prices and fading subsidies are pushing users to rethink how much they rely on frontier models. The article says initial tests suggest that carefully designed systems can sometimes swap in cheaper models for more expensive ones with no measurable drop in quality. It cites legal AI company Harvey as a recent example: the company cut inference costs by 3x by combining models and routing only the hardest tasks to a premium system.
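To make the routing idea concrete, here is a minimal sketch of how a system might send only the hardest tasks to a premium model. It is an illustration under stated assumptions, not a description of Harvey's system: the model names, the prices, the keyword list, and the difficulty heuristic are all hypothetical, and a production router would more likely rely on a trained classifier or on confidence signals from the cheaper model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    price_per_1k_tokens: float  # hypothetical price in USD, for illustration only


# Hypothetical models; neither name refers to a real product.
CHEAP = Model("small-model", 0.10)
PREMIUM = Model("frontier-model", 3.00)


def estimate_difficulty(task: str) -> float:
    """Toy heuristic returning a score in [0, 1].

    Longer prompts and a few hand-picked keywords count as harder.
    This stands in for whatever signal a real router would use.
    """
    hard_keywords = ("analyze", "draft", "multi-step", "reason")
    score = min(len(task.split()) / 200, 1.0)
    if any(keyword in task.lower() for keyword in hard_keywords):
        score = min(score + 0.5, 1.0)
    return score


def route(task: str, threshold: float = 0.5) -> Model:
    """Send a task to the premium model only if it scores at or above the threshold."""
    return PREMIUM if estimate_difficulty(task) >= threshold else CHEAP


if __name__ == "__main__":
    for task in ("Summarize this memo", "Analyze the indemnification clause"):
        print(f"{task!r} -> {route(task).name}")
```

The threshold is the cost lever in this sketch: raising it pushes more work to the cheaper model, and the open question the article raises is how far it can go before quality suffers.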
That kind of optimization could have major consequences for the industry. TechCrunch notes that if companies can move large shares of their workloads to models that are dramatically cheaper, the biggest AI labs could see a financial hit just as firms like OpenAI and Anthropic are preparing for potential IPOs. Coinbase co-founder Brian Armstrong has predicted that “80% of workloads will be running on 99% cheaper models” within 12 to 18 months, leaving only the most demanding tasks on the latest-generation systems.
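A quick back-of-the-envelope calculation shows what Armstrong's figures would imply for a total AI bill. The sketch below rests on a simplifying assumption that the quote does not state: every workload costs the same on the premium model, and usage volume stays flat. The function name and parameters are illustrative.

```python
def blended_cost_fraction(share_moved: float, discount: float) -> float:
    """Return the new total cost as a fraction of the original bill.

    share_moved: fraction of workloads moved to the cheaper model (0 to 1).
    discount: how much cheaper that model is, as a fraction (0.99 = 99% cheaper).
    Assumes all workloads cost the same on the premium model.
    """
    if not (0.0 <= share_moved <= 1.0 and 0.0 <= discount <= 1.0):
        raise ValueError("share_moved and discount must be between 0 and 1")
    stays_on_premium = 1.0 - share_moved
    moved_to_cheap = share_moved * (1.0 - discount)
    return stays_on_premium + moved_to_cheap


if __name__ == "__main__":
    fraction = blended_cost_fraction(share_moved=0.80, discount=0.99)
    print(f"New bill is {fraction:.1%} of the original")
```

Under those assumptions, moving 80% of workloads to models that are 99% cheaper leaves a bill of about 20.8% of the original, a cut of roughly 79%. If the hardest 20% of tasks also consume the most tokens, the real savings would be smaller.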
But the move toward cost savings is not frictionless. Business Insider reported that corporate efforts to trim AI budgets may create internal tension, with some employees or departments gaining access to premium tools while others are pushed onto less expensive alternatives. Such a split could create what one report described as a “corporate caste system,” in which access to better models becomes tied to status, budget, or strategic priority.
The pressure to economize is also visible among AI startups serving consumers. Business Insider reported that Inworld, an AI voice startup, cut its pricing by more than 50% as its chief executive, Kylan Gibbs, moved to help consumer AI companies cope with rising model costs and weak profitability. That reflects a broader reality for startups: lower prices may make AI products viable for more companies, but only if the underlying economics improve enough to support real margins.
For now, the big question is whether cheaper models will become the default for routine work or just a short-term fix. TechCrunch says companies could respond to cost pressure in other ways too, such as making fewer model calls, using less context, or abandoning some deployments altogether. The result is a fast-moving experiment across the industry: if cheaper models can keep quality high, they may become the backbone of everyday AI; if not, companies may continue paying a premium for the best systems and accepting the internal strains that come with it.