The Politics of Inference Scaling

Political debates around frontier AI will change as the paradigm shifts toward inference scaling. I argue this will lead to increased political instrumentalization of inequality concerns, a rising tension between surveillance and misuse prevention, and an increased appetite for strategic sovereignty.


The Inference Scaling Paradigm 

Current progress at the AI frontier indicates a shift of the scaling paradigm away from a sole focus on pre-training and toward a more prominent role for inference compute. This shift was kick-started by last fall’s release of OpenAI’s long-rumored o1 reasoning model, the pro version of which is seemingly expensive and capable enough to justify a monthly subscription at $200; and it was cemented by the stellar reported (but unconfirmed) benchmarking results for o1’s successor, o3. Put more simply: whereas past years’ models mostly saw progress based on how much power and money one threw at their one-and-done training, future models’ capabilities might be much more sensitive to how much (expensive & energy-demanding) computing power is available at the time they’re queried.

Much of frontier AI politics was predicated on the old, pre-training-focused scaling paradigm, so this shift will have implications for future political dynamics. In this post, I look at three areas of AI-related political debate that I think are likely to be affected: inequality, misuse, and sovereignty.

I try to avoid technical predictions, but being specific about next year’s politics requires making some assumptions about next year’s tech that are worth spelling out at the start. This post assumes that the impressive o3 results are real and leverageable, but that they will mostly see incremental improvement and adoption. This is not a post about what happens if the o3 benchmarks are accurate and we get a follow-up o4 with similar marginal increases in mid-2025; I’m skeptical of that trajectory for two reasons.

First, it might well be that the difference between o1 and o3 is due to a difference in base model, e.g. o1 being based in some meaningful way on GPT-4o-mini and o3 being based on 4o, or even o3 being based on a hitherto unreleased GPT-5. If so, there is no similar additional jump in base model readily available to fuel the next iteration. And second, the o3 announcement should be taken with a grain of salt. There’s very little verified information on the model, and OpenAI was under massive competitive pressure to deliver something to round out its release spree and counter the impression of being overtaken, e.g. by GDM. o3 might not be ready for quite some time, and until we actually have it, we don’t know enough about the reasoning model development cycle to base our predictions on it.

Inequality & Access

The pre-training scaling paradigm yielded politically pleasant products: access to the world’s best AI models was not very expensive, and the threshold for using them was very low. Sure, you could always spend a lot on API tokens, but that was neither very salient nor very useful to most. By and large, if I needed the world’s best AI assistance on something, I didn’t need to spend more than $20 to get it. The old Andy Warhol adage held for AI: the president could and would use the same language models I could. This has largely insulated AI politics from concerns of distributive justice, which are much more salient in cases where helpful technology is initially prohibitively expensive, from the first biotechnological interventions to early computing and automation. Now, even at the very start of the inference scaling era, the pricing range has already expanded dramatically: from hundreds of dollars for monthly subscriptions to thousands of dollars for benchmark runs, the best AI outputs have become far more expensive. That change makes concerns around the equal distribution of access to AI much more salient.


This is pretty expensive, for now.


The jury is still out on whether this is a substantial reason to be worried about access equality. The very best models are very expensive, but the ramifications of that depend on how wide the gap to affordable models is. Steep price decreases might be possible with efficiency gains, and there’s hope for quick bootstrapping of mini models, as suggested e.g. by the projected benchmarks for o3-mini.

But for political dynamics, I believe it doesn’t really matter if, say, o4-mini is much worse than o4; what matters is the impression that will irresistibly arise from the pricing and provision structures. There is a story here that can very easily be told:

‘There’s prohibitively expensive AI running on gigantic clusters somewhere, and only the few rich enough to afford them have access to it. To make matters worse, it’s in the hands of a tech elite that is currently at the apex of its political influence; and they are planning to use it to get rid of the jobs that sustain everyday people and the political power that comes with them.’

First, I believe that story is politically effective and will be told. Perhaps by anyone looking to score quick political points, but mostly by adversaries of AI development and beneficial deployment: there are many entrenched interest groups that fear short-term losses from AI’s labour market effects, and they are very likely to leverage this sort of narrative to great effect. Whoever turns out to be the longshoremen’s equivalent for frontier AI will be very happy to stoke the fires of the distribution discussion; and there will be a lot of AI longshoremen if even the more pessimistic predictions about AI adoption over the next few years come to pass.

Second, safety advocates may be tempted to leverage the inequality story. A fair share of the safety coalition operates on risk assessments that suggest slowing down AI development is worth a high price. On that view, endorsing the inequality concern might seem attractive at first (we saw first indications of that when the SB-1047 coalition expanded to seemingly unrelated unions). This urgently requires more in-depth strategic reflection, and I’m somewhat concerned that the safety coalition will skip that reflection in favour of a quick win. Issues of inequality tend to produce particularly sticky fronts, and I’ve argued elsewhere that safety politics has already erred into too many entanglements at other junctures. Here, too, I think conflating the inequality narrative with the safety argument would be a big mistake: for coalition integrity, for messaging consistency, and for reputation management.

Third, the inequality story opens a political door to costly ideas of state provision. On the one hand, if something is a limited resource that’s priced prohibitively by the market, some have the instinct that the state should provide it. On the other hand, the inequality story casts AI labs, the current stewards of frontier capabilities, in a very critical light. Combined with the creeping sovereignty worries around AI (more on that below) and general European proclivities for state-funded capacity build-up, the idea of more active state participation in AI development becomes more politically realistic. So far, that sort of approach does not have a very good track record, and it is very difficult to imagine how it could play out differently this time. Best not to follow that particular temptation.

Please don’t do this again.

Misuse & Surveillance

Very generally, the shift to inference scaling seems pretty good for preventing misuse in the short term.

A lot of threat models around misuse were centered on proliferation and accessibility - the idea that AI could enable more people to commit more serious crimes. The aggressors and misusers in these scenarios - extortionists, non-state actors and the like - are usually cast as actors with limited resources and unlimited malicious intent: the kind of people who would gladly use a cheap tool - say, run a local open source model or query some unrestricted or jailbroken version of a model hosted somewhere - to enhance their ability to do harm, but who would generally not spend big on setting up their own AI infrastructure. In the inference scaling era, their pathways seem at first more constrained: the only way to query a leading model and mobilise its capacity for harm is to spend big on one of the very few services backed by clusters that can run the inference. But these providers are usually fairly on top of their anti-jailbreak game, can be quite readily constrained by liability and regulation, and have an outsized economic incentive not to function as criminal accelerators. There is far less room for shady alley peddling of top-tier AI outputs in the inference scaling world.
This is pretty good news. But it’s something to be mindful of for two reasons: 

First, momentary relief might create policy path dependencies that hinder future misuse mitigation. Over the course of the next year, it might be easy for opponents of regulation to present the shift as a reason not to deal with misuse in the laws being passed right now. Assume, for instance, that state policymakers followed that argument and allowed for broad open source exceptions in laws that are intended to address risks a couple of years down the road. It seems a foregone conclusion that costs will go down sufficiently to enable decentralised, more misuse-prone AI deployments soon enough. And it seems unlikely that the legislature would keep pace with these market dynamics if the initial law was passed under the assumption that decentralised misuse was of no concern.

This sort of politicking is not a hypothetical spectre. These opponents of misuse-focused regulation exist, and they’re particularly fervent and well-organised where open source, one of the main culprits in long-standing misuse worries, is concerned. Take, for example, the arguably false testimony of VC firm Andreessen Horowitz to the UK government below. They are part of a broader anti-regulatory coalition that played a major role in the ultimate failure of California’s SB-1047; they’ll be back for next year’s battles, too.


Bad-faith overstatements of directionally factual trends, like this one from Andreessen Horowitz to the UK Government, would not be shocking news.


Second, under the cover of misuse prevention, the risk of restrictive state control and surveillance increases. Surveillance becomes generally more feasible because there are fewer nodes that have to be controlled: for anyone to receive a top-tier AI output, it has to come out of a big cluster. So lock down the clusters, and you lock down people’s ability to receive those outputs; lock down the GPUs, and you lock down their ability to generate that capability in the first place. Surveillance also becomes more attractive for the very same reason that implied the good news on misuse: it might actually be reasonably effective at preventing misuse in the short term.

Now, as detailed in the section above, this is not likely to last. Costs will go down, capabilities will proliferate, users will find a way. But the laws are being made now, while there is some logic to motivate them. And governments aren’t usually very good at giving up surveillance and enforcement capabilities once they’ve gotten them. So the current window requires some degree of foresight: if one is worried about the securitisation of even mundane civilian uses, the curtailing of democratised capabilities that might come with it, or the stymying of the free market activity needed to spread the benefits of AI, then now might be a very good time to be particularly watchful.

Sovereignty

Most countries are not remotely strategically sovereign regarding AI: the few leading models and the clusters on which they are trained are concentrated in the US and China. This has given policymakers virtually everywhere some pause, but it has not inspired the kind of alarmed awareness and decisive action that might be appropriate. Outside the US & China, building out AI capacity specifically, as opposed to considering it one of many generally nice-to-have ‘future technologies’, is not a mainstream goal. But I believe the shift to inference scaling might change some minds on the topic of sovereignty. (It probably won’t reach those who dismiss sovereignty because they dismiss AI in general, but I suspect that group will shrink quickly anyway.)

In the pre-training scaling era, it was easy to frame frontier development gaps as a question of the global distribution of labour: some countries, with their energy prices and talent pools, are simply more suited to producing the models; others might be better at integrating, deploying or leveraging them. This framing has snuck in under the disguise of a ‘realistic’ attitude to AI politics that has become particularly pervasive in Europe and might be paraphrased as: “Let the major AI powers build their silly little clusters and train their models. We’ll just run free OS models, buy competitively priced API access, thereby benefit from tough races with tight margins, and create value further down the chain.” In that world, the notion of really losing access to high-quality models is unrealistic; you don’t need as much infrastructure to run a model, and surely you’ll get one somewhere somehow. That view benefitted from a parallel to other industrial policy: it’s sometimes uncomfortable, but generally quite bearable, for complex economies to have some crucial supply chain elements produced only by overseas allies.


Leading Brussels thinktank Bruegel’s strategy to flourish below the frontier is now less likely to fly.


But with the shift to inference scaling, you can no longer be a reasonably independent ‘model importer’ without the compute resources to run the inference yourself. So the entire logic of dismissing the sovereignty concern crumbles: having your own massive data center infrastructure, with the associated organisations to leverage it, is no longer just a question of the capacity to build a valuable infrastructural product. It’s also a question of the capacity to use that product within your own borders. Or, put inversely: if you don’t have the computing infrastructure to develop frontier models, theoretical access to run local versions doesn’t help you much; you won’t have the compute to get the results out of them. Sure, this is to some extent already the case, in that even non-reasoning models need some compute capacity to run - but it seems feasible in principle to ramp up e.g. European capacities enough to match the test-time needs of non-reasoning models. Not so in the inference paradigm. This changes the conversation: it shortens planning horizons and entirely dispels the illusion of a carefully chosen mutual dependency. There’s a reason why virtually no successful country depends on the uninterrupted import flow of a critical resource.

Dependencies are tolerable and sovereignty is optional where resources may be stockpiled – but in the age of inference scaling, there are no strategic intelligence reserves. You’re either sovereign or not. Policymakers will take note.

Outlook

The type of scaling paradigm matters to a lot of the foundational assumptions that have shaped AI politics over the last few years. With the advent of inference scaling, some of these assumptions change radically. I expect the political instrumentalization of inequality concerns, the tension between surveillance and misuse prevention, and the need for strategic sovereignty to play a much bigger role than before.



I try to post updates on my writing, but not on much else, on X / Twitter.

