Moving to Substack
My future writing will only be available on Substack. This section will remain intact for archival purposes, i.e. because I spent way too much time optimising the aesthetics of these posts to migrate them to Substack without leaving a copy.
High Stakes for Middle Powers: Why I work on German AI Policy
Working on the frontier AI policy of middle AI powers like Germany is underrated: A lot of AI benefits are contingent on middle power policy, their successes and failures can quickly cascade, and middle power voices matter greatly on global governance.
Why I work on German AI Policy
I spend some of my time working on German frontier AI policy. That goes against the strong trend of AI policy folks seeking the largest jurisdiction in pursuit of superlinear effects of their work. So sometimes, I get asked why I spend time on Germany. I think some of my responses are broad enough to matter even beyond Germany’s borders. I argue:
Policy contingency of AI benefits is higher in middle powers. Policy work matters a lot.
Beyond its own borders, Germany will likely serve either as a blueprint or as a cautionary tale.
Germany is highly relevant to some important international dynamics. Without good policy work, it could mess them up.
AI policy work in middle powers can insulate against failure modes and shape reactions to external wake-up moments. Policy organisations and the many large AI diasporas hailing from middle powers could take note.
Most People Don’t Live In The US
Frontier AI policy talk nowadays is largely a tale of two cities: There’s the US as a principal nexus, hosting all frontier labs, the leading manufacturer, most major policy organisations, and holding most of the cards when it comes to comprehensive regulation of model development. There’s also China, viewed as a main adversary, treated as a central motivator for many different maximalist demands from open-source crackdowns to all-out racing and prohibitive export controls. There’s also the UK’s impressive efforts to secure the third spot on the podium – but beyond that, prevailing opinion holds that no policy professional worth his salt should spend much time working on other national jurisdictions. But a lot of people live in other countries, too. Many of these are what I’d consider middle AI powers - economies with some AI presence and plenty of AI compatibility, but no current place at the frontier. These countries and their governments might not all have an outsized effect on how the development of frontier AI shakes out, and so it’s quite justified for a lot of people to work on US policy. But the policies of other nations still have a substantial effect on how this development affects their respective citizens – who do make up a majority of the world’s population.
In fact, I suspect that many practical outcomes of AI progress for countries that are not global leaders are far more contingent on successful policy, because they are not the default ground zero for any avenue of adoption. By successful policy, here and throughout this piece, I try my best not to import any value judgement: I mean neither particularly AI-safety-focused policy nor particularly hands-off dynamistic policy, but simply policy that sufficiently takes into account the recent and likely future rapid trajectory of AI capability growth.
For instance, by default, the US population will probably somehow partake in the everyday benefits of adopting US frontier labs’ models: Invariably, US models will be released and deployed in the US. For a country like Germany to benefit from US-driven AI progress, a lot more policy has to go right: It needs continued access to these models and must not lose it via geopolitical conflict, overregulation or affordability constraints due to economic downturn. The same argument applies e.g. to the security umbrella provided by non-civilian use – which is by default guaranteed to the US, but will only extend across the Atlantic under specific circumstances. Or to the availability of computing power to become sovereign or at least run your own inference on – which will presumably be provided in the US, but could well be strongly constrained elsewhere.
A recent, much-read piece published by the Hoover Institution makes all that abundantly clear: In it, even meaningful participation of Five Eyes countries is framed as highly uncertain. To be clear: It should not be taken for granted that US-aligned allies get unfettered access. This is not a matter of demanding access or railing against a bullying Trump administration – it’s a matter of making grown-up policy choices and leveraging strengths to make the kind of participation we want mutually beneficial.
Many more examples come to mind, which are all to say: Life as a middle AI power is set to be acutely difficult and will require genuinely good policy work.
Germany is one of these middle powers, of course, and with 82 million people, the world’s third largest economy and fourth highest industrial output, there’s simply a substantial share of ‘the world’s activity’ here. Even by virtue of that, I’d think the way that frontier AI policy is done there should matter to some extent. (And of course, similarly, what is done in similar countries, too.) But my argument goes beyond that, too.
Quick Aside: Why not EU?
You might think that this is all fine and true, but could be addressed by EU-level action and policy work, where the EU AI Office has gathered a strong suite of technical and policy talent. I think the EU AI Office does great work on everything it has a mandate for, but I’m skeptical that this mandate will expand to next years’ frontier AI politics as they relate to Germany (and likely France and Italy). That is for broadly three reasons:
First, insofar as AI systems become increasingly relevant to ‘hardcore’ issues of national security, the EU will be consulted on them less due to lack of confidentiality & lack of loyalty to idiosyncratic national security interests.
Second, insofar as AI systems become increasingly central to economies and societies, they and their regulation are likely to become more politicized – which is to say there will be greater national political interest to influence them directly and less incentive to turf them to a more technocratic, but less political EU.
Third, between deeply unpopular legislation and purported chilling economic effects, some national governments are already concluding that the EU has been given too much leeway and hasn’t used it too well. This sentiment is further enhanced by a perceived overall ‘vibe shift’ supercharged by an incoming Trump administration and its threat to back US tech corporations against EU regulation. When given the choice, major national governments might choose to do their frontier AI policy in-house instead.
Aspirational and Cautionary Tales
Beyond Germany’s borders, its frontier policy matters in two further ways that apply to Germany more than to some other middle powers. The first relates to aspirational and cautionary tales. Among the middle AI powers, Germany is one that many smaller countries will look at - the jury’s still out as to whether that will be in imitation or rejection. If done well, a strong and successful adaptation to AI progress could serve as a blueprint for the actions of other middle powers, effectively scaling up German policy wins at least throughout the group of similar-ish countries in which a lot of the world’s GDP and industrial capacity still lies. The UK already offers one such blueprint – on how to participate from a position of AI-specific strength, provided world-leading talent and a technology sector that largely outperforms the rest of an ailing economy.
Germany could offer another blueprint: On how the integration of AI can help leapfrog an atrophied digital environment, on how it can be integrated into a predominantly industrial economic model, and perhaps by a power that’s just a little bit less entirely aligned with the US. In many senses, Germany is well-positioned to do so. First as an economy: For pre-AI-rise levels, it had decent data center capacity; it has a comparatively highly industrial economy that might be less immediately susceptible to near-future AI labour market impacts and a lot better at leveraging these systems’ benefits; and it still produces world-leading research contributions on many levels that would stand to benefit from AI-based acceleration. And second as a political environment: It has the theoretical overall state capacity to pull off decisive moves, the policy environment and associated political stability to be receptive to outside advice, and the political window to take decisive action this year. If we find good policy interventions, and make good choices about where not to intervene, the findings might be worthwhile elsewhere.
But Germany could also fail to adapt to AI developments. It could keep squabbling about its many other economic and political problems, largely inherited from the last decade or two, and only find the capacity to adapt to a new wave of changes once it’s already washed over its economy and society. Such a failure might push the country into surrendering its aims of sovereign competitiveness and further into blocs and geopolitical subservience; it might also encourage political backlash and breed pessimism and luddite techno-skepticism at the prospect of having no path to maintain comparative standing in a world with ubiquitous advanced AI. I don’t mean to overstate how much other countries care about Germany, but still: the global perception of a country with as strong a starting position faltering at that adaptation could have ripple effects. It could disabuse slightly weaker middle powers of their plans of sovereignty, invite them to blindly align with one geopolitical master or another, or encourage the same sort of political backlash. In short, it seems valuable to provide models for middle power successes, and it seems risky for a well-positioned middle power to falter and fail. Germany is one rare place where policy work might affect which of these stories will be told.
Internationally, Germany Is Somewhat Influential Still
Last, Germany still is a major voice in somewhat important fora: It’s one of the top NATO contributors, a de-facto co-leader of the EU, one of the G7 and one of the most well-positioned G20 members. In these roles, Germany can be quite impactful, from seriously endangering a lot of reasonable international policy to being a genuine driving force for good compromise.
Seriously endangering international progress is not all that unlikely: The fora listed above are some of the precious few avenues to ensure meaningful international cooperation on critical issues, from non-proliferation of dangerous capabilities to limits on military uses and more. Many of the very sophisticated international frameworks that have been suggested would be in danger if Germany wasn’t up to speed on AI. For a recent example: The EU AI Act almost got derailed by a lobbying push that hinged on motivating the German government to protect a nascent frontier model development sector that did not exist in any relevant sense. That plan almost succeeded due to lack of state capacity and expertise on frontier AI. To be clear, I think there were reasonable arguments to refuse the current version of the AI Act on economic grounds, as were provided e.g. by France – but that is emphatically not the conversation that happened in Germany. No matter what you think about the AI Act: If you care about international agreements, you probably don’t want a major international player to be far enough from the policy frontier to be this susceptible to shallow motivated campaigns from any interest group.
If things changed, Germany’s diplomatic position might very well be an asset to global AI policy. Very generally, Germany has historically performed a conciliatory function between power blocs, e.g. in the late 70s and 80s between the USA and USSR in relation to nuclear weapon stationing and in the 2010s with regard to Russia. Today, for a western-aligned country, its relationship and economic integration with China is fairly strong, to an extent that could make it an important player and intermediary in some scenarios of high-stakes diplomacy around frontier AI development, deployment and military use. Even if it were my field of expertise, making more precise predictions about the geopolitical dynamics of upcoming AI diplomacy would be difficult given the uncertainties around the technical trajectory. For now, I simply want to say: Germany could matter here. If it does, it’s probably much better if it has well-informed frontier AI policy.
The Situation In Germany
This is not a post about German AI policy, per se. But to give some idea of what actions the arguments above might imply, I still provide a very short overview of the kinds of challenges it’s facing.
On policy, the only legislation concerning frontier AI so far is an early draft on EU AI Act implementation – i.e. a bill chiefly concerned with downstream enforcement of strategic considerations made elsewhere that largely only affects consumer market issues. Neither broader government strategy documents nor the plans outlined in the party platforms for the upcoming election next February suggest any change to that. It would be no exaggeration to say that there is no frontier AI policy originating from Germany. But that is partly because parties have only recently gotten interested in addressing the issue on a national level – and I’d expect at least a more prominent role for AI in the next administration’s economic policy, and maybe some reactive pieces of frontier AI legislation depending on external shocks.
On capacity, the German civil service has long-standing issues with attracting young, dynamic talent that do not stop with AI. Germany’s broader ecosystem for political advice is similarly calcified, with a lot of incumbency-based structures and few pipelines for policy advice keeping the government more insulated from disruptive advice than elsewhere. But recent plans to relax and reshape structures around digital policy, as well as some breakthrough advisory organisations, have begun to upset this at least a little.
On clustering, there are a lot of latent category errors happening: Predominantly, AI is either understood as part of a big cluster of broadly relevant ‘future technologies’ one ought to be excited about; as part of a set of digital issues that also includes social media platforms and state digitisation; or as part of a set of civil liberty threats that also includes narrow AI systems from facial recognition to predictive algorithms.
Standalone awareness of the issue is exceedingly rare and becomes rarer still because these misconceptions shape the ecosystem. There are parliamentarians and organisations for each of these issue clusters, and each of them is eager to claim jurisdiction over AI based on their idiosyncratic categorization – and with limited general political interest, appreciation for the singular character of recent AI progress fails to materialize.
What’s Next?
Based on all of that, my model for the future of German frontier AI policy is not that a concerted push will lead to a meaningful shake-up in the short term. I am more confident that latent presence can prevent a lot of the genuinely bad effects I outlined above. Beyond that, I think the main impact lies in being ready for the external wake-up call. I am confident that critical moments are coming up quickly, whether they’re prompted by really good AI news or really bad AI news: at some point, awareness will reach the upper political echelons.
And at that point, I would like people to be ready: Ready with good ideas, ready with established channels of communication, ready with an established track record of caring about what happens in these middle powers. To my mind, that’s the best way to ensure that whenever they do happen, middle powers’ reactions to AI progress are shaped by a realistic view of what is happening, and aren’t susceptible to paralysis or overreaction.
In very practical terms, that opens some avenues for participation that I wish were better traveled and that I suspect are similarly open to other middle powers as well: The troves of highly qualified diaspora Germans can reach out back home and start engaging - I suspect similar groups exist for many other middle powers, too. The landscape of policy organisations could extend much more prominently to the national politics of middle powers. And the broader funding ecosystems could take engagement in these countries into account. The frontier AI policy of Germany – and its fellow middle powers – matters a little bit more than most people think.
I try to post updates on my writing, but not on much else, on X / Twitter.
The Politics of Inference Scaling
Political debates around frontier AI will change as the paradigm shifts toward inference scaling. In a broad overview, I predict political instrumentalization of inequality concerns, a rising tension between surveillance and misuse prevention, and an increased appetite for strategic sovereignty.
The Inference Scaling Paradigm
Current progress at the AI frontier indicates a shift of the scaling paradigm away from a sole focus on pre-training and toward a more prominent role of inference compute. This shift was kick-started by last fall’s release of OpenAI’s long-rumored o1 reasoning model, the pro version of which is seemingly expensive and capable enough to justify a monthly subscription at $200; and it was cemented by the stellar reported (but unconfirmed) benchmarking results for o1’s successor, o3. Put more simply, whereas past years’ models would mostly see progress based on how much power and money one threw at their one-and-done training, future models’ capabilities might be much more sensitive to how much (expensive & energy-demanding) computing power is available at the time they’re queried.
Much frontier AI politics was predicated on the old, pre-training-focused scaling paradigm, so this shift will have implications for future political dynamics. In this post, I look at three areas of AI-related political debate I think likely to be affected: Inequality, misuse, and sovereignty.
I try to avoid technical predictions, but being specific about next years’ politics requires making some assumptions about next years’ tech that are worth spelling out at the start. This post assumes that the impressive o3 results are real and leverageable, but will mostly see incremental improvement and adoption. This is not a post about what happens if the o3 benchmarks are accurate and we get a follow-up o4 with similar marginal increases in mid-2025; I’m skeptical of these predictions for a few reasons.
This is mainly because, first, it might well be that the difference between o1 and o3 is due to a difference in base model; e.g. by o1 being based in some meaningful way on GPT-4o-mini and o3 being based on 4o, or even o3 being based on a hitherto unreleased GPT-5. There is no similar additional jump in base model available. And second, because I believe the o3 announcement should be taken with a grain of salt. There’s very little verified information on the model, and OpenAI was experiencing massive competitive pressure to deliver something to round out its release spree and to counter the impression of being overtaken e.g. by GDM. o3 might not be ready for quite some time, and until we actually have it, we don’t know enough about the reasoning model development cycle to base our predictions on it.
Inequality & Access
The pre-training scaling paradigm yielded politically pleasant products: Access to the world’s best AI models was not very expensive, and people could use them at a very low threshold. Sure, you could always spend a lot on API tokens, but that was neither very salient nor very useful to most. By and large, if I needed the world’s best AI assistance on something, I didn’t need to spend more than $20 to get it. The old Andy Warhol adage held for AI: The president could and would use the same language models I could. This has largely insulated AI politics from concerns of distributive justice, which are a lot more salient in cases where helpful technology is initially prohibitively expensive, from the first biotechnological interventions to early computing and automation. Now, even at the very start of the inference scaling era, the pricing ranges have expanded dramatically already - from thousands of dollars for benchmark results to hundreds of dollars for monthly subscriptions, the best AI outputs have become dramatically more expensive. That change makes concerns around equal distribution of access to AI much more salient.
The jury is still out on whether this is a substantial reason to be worried about access equality. The very best models are very expensive, but the ramifications of that depend on how wide the gap to affordable models is. Steep price decreases might be possible with efficiency gains, and there’s hope for quick bootstrapping of mini models, as suggested e.g. by the projected benchmarks for o3-mini.
But for political dynamics, I believe it doesn’t really matter if, say, o4-mini is much worse than o4; what matters is the impression that will irresistibly arise from the pricing and provision structures. There is a story here that can very easily be told:
‘There’s prohibitively expensive AI running on gigantic clusters somewhere, and only the few rich enough to afford them have access to it. To make matters worse, it’s in the hands of a tech elite that is currently at the apex of its political influence; and they are planning to use it to get rid of the jobs that sustain everyday people and the political power that comes with it.’
First, I believe that story is politically effective and will be told. Maybe as a quick way to score general political points for anyone, but most specifically by adversaries of AI development and beneficial deployments: There are many entrenched interest groups that fear short-term losses from AI labour market effects and are very likely to leverage this sort of narrative to great effect. Whoever is the equivalent of the longshoremen for frontier AI will be very happy to stoke the fires of the distribution discussion; and there will be a lot of AI longshoremen if even the more pessimistic predictions about AI adoption in the next few years come to pass.
Second, safety advocates may be tempted to leverage the inequality story. A fair share of the safety coalition operates on risk assessments that suggest slowing down AI development at a high price is worth it. On that view, endorsing the inequality concern might seem attractive at first (we saw the first indications of that when the SB-1047 coalition expanded to seemingly unrelated unions). This urgently requires more in-depth strategic reflection, and I’m somewhat concerned that the safety coalition will skip that reflection in favour of a quick win. Issues of inequality tend to produce particularly sticky fronts, and I’ve argued elsewhere that safety politics has already erred into too many entanglements at other junctures. Similarly, I think conflating the inequality narrative with the safety argument would be a big mistake; for coalition integrity, for messaging consistency, and for reputation management.
Third, the inequality story opens a political door to costly ideas of state provision: On the one hand, if something is a limited resource that’s priced prohibitively by the market, some have the instinct that the state should provide it. On the other hand, the inequality story casts AI labs, the current stewards of frontier capabilities, in a very critical light. Combined with the sneaking sovereignty worries around AI (more on that below), and general European proclivities for state-funded capacity build-up, the idea of more active state participation in AI development becomes more politically realistic. So far, that sort of approach does not have a very good track record, and it seems very difficult to imagine how it could play out differently this time. Best not follow that particular temptation.
Misuse & Surveillance
Very generally, the shift to inference scaling seems pretty good for preventing misuse in the short term.
A lot of threat models around misuse were centered around proliferation and accessibility - the idea that AI could enable more people to commit more serious crimes. The aggressors and misusers in these scenarios - extortionists, non-state actors (NSAs) and the like - are usually cast as actors with limited resources and unlimited malicious intent: The kind of people who would gladly use a cheap tool to enhance their ability to do harm - say, run a local open source model or query some unrestricted or jailbroken version of a model hosted somewhere - but would generally not spend big on setting up their own AI infrastructure. In the inference scaling era, their pathways seem at first more constrained: The only way to query a leading model and mobilise its capacity for harm is to spend big on one of the very few services backed by clusters that can run the inference. But these services are usually fairly on top of their anti-jailbreak game, can be quite readily constrained by liability and regulation, and have outsized economic incentive to not function as a criminal accelerator. There is way less room for shady alley peddling of top-tier AI outputs in the inference scaling world.
This is pretty good news. But it’s something to be mindful of for two reasons:
First, momentary relief might create policy path dependencies that hinder future misuse mitigation. Over the course of the next year, it might be easy for opponents of regulation to present the shift as a reason not to deal with misuse in the laws being passed right now. Assume, for instance, that state policymakers followed that argument and allowed for broad open source exceptions in laws that are intended to address risks a couple of years down the road. It seems a foregone conclusion that costs will go down sufficiently to enable decentralized, more misuse-prone AI deployments soon enough. And it seems unlikely that the legislature would keep pace with these market dynamics if the initial law was passed under the assumption that decentralized misuse was of no concern.
This sort of politicking is not a hypothetical spectre. These opponents of misuse-focused regulation exist, and they’re particularly fervent and well-organized where open source, one of the main culprits of long-standing misuse worries, is concerned. Take only the arguably false testimony of VC firm Andreessen Horowitz to the UK government. They are part of a broader anti-regulatory coalition that played a major role in the ultimate failure of California’s SB-1047; they’ll be back for next years’ battles, too.
Second, under the cover of misuse prevention, the risk of restrictive state control and surveillance increases. Surveillance becomes generally more feasible because there are fewer nodes that have to be controlled. For anyone to receive a top-tier AI output, it needs to go through a big cluster. So lock down the clusters, and you lock down the ability of people to receive outputs. Lock down the GPUs, and you lock down the ability of people to generate that capability. It also becomes more attractive for the very same reason that implied the good news for the misuse case: The surveillance might actually be reasonably effective in preventing misuse in the short term.
Now, as detailed above, this is not likely to last. Costs will go down, capabilities will proliferate, users will find a way. But the laws are being made now, while there’s some logic to motivate them. And governments aren’t usually very good at giving up surveillance and enforcement capability once they’ve gotten it. So the window right now requires some degree of foresight: If one is worried about the securitisation even of mundane civilian uses, the curtailing of democratised capabilities that might come with it, or the stymying of free market activity to push benefits from AI, then now might be a very good time to be particularly watchful.
Sovereignty
Most countries are not remotely strategically sovereign regarding AI: The few leading models and the clusters on which they are trained are concentrated in the US and China. This has given policymakers virtually everywhere some pause, but has not inspired the kind of more alarmed awareness and more decisive action that might be appropriate. Outside the US & China, specifically building out AI capacity as opposed to considering it one of many generally nice-to-have ‘future technologies’ is not a mainstream goal. But I believe the shift to inference scaling might change some minds on the topic of sovereignty. (It probably won’t reach those that dismiss sovereignty because they dismiss AI in general, but I suspect that group will shrink quickly anyways.)
In the pre-training scaling era, it was easy to frame frontier development gaps as a question of global distribution of labour: Some countries, with their energy prices and talent pools, are simply more suited to producing the models, others might be better at integrating, deploying or leveraging them. This has snuck in under the disguise of a ‘realistic’ attitude to AI politics that has become particularly pervasive in Europe, and which might be paraphrased as “Let the major AI powers build their silly little clusters and train their models. We’ll just run free OS models, buy competitively priced API access, thereby benefit from tough races with tight margins and create value further down the chain.” In that world, the notion of really losing access to high-quality models is unrealistic; you don’t need as much infrastructure to run a model, and surely, you’ll get one somewhere somehow. That view benefitted from parallels to other industrial policy: It’s sometimes uncomfortable, but generally quite bearable for complex economies to have some crucial supply chain elements only be produced by overseas allies.
But through the shift to inference scaling, you can no longer be a reasonably independent ‘model importer’ without the compute resources to run the inference yourself. So the entire logic of dismissing the sovereignty concern crumbles: Having your own massive data center infrastructure with the associated organisations to leverage it is no longer just a question of capacity to build a valuable infrastructural product. It’s also the question of capacity to use this product within your own borders. Or, put inversely: If you don’t have the computing infrastructure to develop frontier models, it doesn’t help you much to get theoretical access to run local versions; you won’t have the compute to get the results out of them. Sure, this is, to some extent, the case already in that even non-reasoning models need some compute capacity to run - but it seems in principle feasible to ramp up e.g. European capacities enough to match the test-time needs of non-reasoning models. No more in the inference paradigm. This changes the conversation: it shortens planning horizons and entirely dispels the illusion of a carefully chosen mutual dependency. There’s a reason why virtually no successful country is dependent on the uninterrupted import flow of a critical resource.
Dependencies are tolerable and sovereignty is optional where resources may be stockpiled – but in the age of inference scaling, there are no strategic intelligence reserves. You’re either sovereign or not. Policymakers will take note.
Outlook
The type of scaling paradigm matters to a lot of foundational assumptions that have shaped AI politics over the last years. With the advent of inference scaling, some of these assumptions change radically. I expect political instrumentalization of inequality concerns, the tension between surveillance and misuse prevention, and the need for strategic sovereignty to play a much bigger role than before.
I try to post updates on my writing, but not on much else, on X / Twitter.
Problems in the AI Eval Political Economy
Evaluations of new AI models’ capabilities and risks are an important cornerstone of safety-focused AI policy. Currently, their future as part of the policy platform faces peril for four reasons: Entanglements with a broad AI safety ecosystem, structural incentives favouring less helpful evals, susceptibility to misinterpretation, and normalization of inaction.
Introduction
Evaluations of AI models are a mainstay of the AI safety environment: They provide vital insights into models’ capabilities (what they can do - this is most important to how models can be misused) and propensities (what they are likely to do - this is most important to how models might misbehave themselves). Insight from evals might inform developers’ choices around deploying their models. It’s also an important part of policy instruments: Wherever policymakers want to tie limitations and obligations to levels of capability and risk, trustworthy and reliable evals are important. This is especially true given the inaccuracies of other measurable proxies for capability and risk: Without evals, much safety policy is a blunt instrument, prone to stall progress and susceptible to political attack. If policy is to be a pillar of building safe AI, evals have to be a cornerstone. But they face looming political risks.
The Eval Policy Platform
A successful eval ecosystem is broadly dependent on two kinds of policy support: Evals as policy instruments, e.g. using eval results as a basis for product standards, release conditions, etc; and eval ecosystem support, e.g. through grants and state contracts. This distinguishes safety-focused evals from assurance services focused on a narrower band of capabilities and risks. Those often need neither angle of political support, because they are more closely related to myopic commercial interest and are supported by organic market demand.
But evals do need that support if they’re going to be valuable to policy, and so far, they’ve received at least some of it. This success is the result of genuinely impressive technical work and effective advocacy. It’s also due to evals being a relatively inoffensive policy platform: They have the air of scientific neutrality, they are in principle palatable to all political groups around AI, and as long as requirements to comply aren’t too high, they place fairly little burden on industry. As such, building policy around evals is a key area of focus for safety-oriented AI policy. I believe this policy agenda faces rougher headwinds in four areas – and try to provide some brief ideas for the future course.
These are not comments on the technical content and usefulness of evals. Others are much more qualified to comment on those. There is a separate debate to be had about the insights that evals can provide independent of policy frameworks; this is not a comment on that debate, and not an endorsement of evals as a technical tool for AI safety.
Entanglements and Conflicting Interests
First, eval organisations are caught in an odd web of incentives and associations. Currently, there’s a substantial overlap between the AI safety policy ecosystem and eval orgs. That’s for good historical reasons; safety-focused funding ensures orgs can focus on the kind of horizon-focused eval that the narrow assurance market cannot cover, and guaranteed their survival in the early days of the advanced AI ecosystem. But that structure could prove to be a political burden because it prompts visible conflicts of interest. Many prominent eval orgs are funded by the same sources that also fund ostensibly independent policy organisations pushing for evals as a policy instrument - in effect advocating for their sister organisations’ ongoing funding and employment. Transitioning to a for-profit eval ecosystem is a treacherous path as well – California AI safety leader Dan Hendrycks’ parallel involvement in the drafting of SB-1047 and for-profit eval company Gray Swan was ultimately remedied, but arguably cost a lot of political support in the interim.
This is readily available and very effective political ammunition for opponents, and I would strongly expect future policy discussions to include safety skeptics drawing lines between policy advocates and eval orgs that stand to benefit, e.g. via funding links, personal histories and current entanglements. If there is to be any hope that evals take a prominent role in informing and implementing policy, this vulnerability should urgently be reduced. The way forward might be a clearer separation between the technical services that eval companies provide and the political discussion around where they should be employed. In that environment, eval companies would be stricter for-profit entities with less mission association; and other stakeholders with no stake in the eval industry, from governmental AI safety institutes to third-party civil society, would employ and thereby direct their services. This would separate ‘doing good evals’ more clearly from ‘obligating others to order evals’.
Incentives Favour the Wrong Evals
Second, the current environment doubly incentivizes dramatic capability evals. First, findings on what new, crazy, scary thing a new model can do play well with media, founders, and on social media. This places pressure on eval organisations to prioritise evals that could lead to those results, and amplifies the parts of their work that focus on them. Propensity results are much harder to communicate in that way: Reporting and perception might, at best, differentiate between a >0% (‘model could’), a >10% (‘model can’) and a >90% (‘model will’) result, but the headline is almost as good with 1% as with 50% — there’s just very little attention on the market left for the step to more differentiated results. Second, capability findings play well for developers – for instance, it seems obvious that OpenAI’s claims to be on track to AGI benefit from eval results confirming concerning degrees of scheming and planning ability; or that cybersecurity or biosecurity risks implicitly point at respective capabilities. Developers can boost these findings, e.g. by how they distribute and provide eval access to their models, by what to feature on their system cards, etc. So far, they have very effectively played on the edge of allowing, using, and boosting somewhat scary capability results to their own advantage. This is a risk because it invites slowly getting used to increasingly alarming eval results, as discussed further below; and because it places evals under suspicion of collaboration. The more eval results appear as part of a carefully calibrated PR system of frontier labs that draws on AI safety worries, the less they’ll enjoy unequivocal support as policy instruments — and the supposition of such a hype-generating safety ecosystem is neither new nor rare.
All of this is not entirely the fault of eval organisations. Much has to do with media reception, political instrumentalisation and commercial interest. But they’re still the ones who can most readily change this political economy for the better.
Evals will be in a weird spot between burden and boost to model releases until a more structured policy framework around evals that hinges less on the voluntary decisions of developers is introduced. Until then, the production and proliferation of striking capability results can boost the development of potentially unsafe models and weaken evals’ strategic position down the road. It might also come at the expense of potentially interesting academic propensity work or narrower and more deterministic harm vector examinations (others have talked about this at much better length and detail). This is all unfortunate on its own, but it becomes a big liability in conjunction with the next point:
The Rabbit-Duck-Problem
Third, many eval results invite drastically divergent interpretations of the same findings - a ‘rabbit-duck-problem’. Statements of the type ‘a model does x under conditions y’ reliably run into the same reception: Whoever was worried about the model ‘doing x’ sees the eval as a confirmation that the model can indeed do x - a dangerous capability increase! But that view is often mistaken: The possibility of eliciting dangerous model capabilities is often simply a corollary of general and unconcerning capability increases, whereas the concerning element lies in propensity or system-level accessibility of these capabilities. Whoever was not too worried about these capabilities, on the other hand, will focus on condition y - of course the model demonstrated a dangerous capability, it was prompted to do so! That interpretation is likewise sometimes a bit simplistic – AI models often get prompted to do things in the wild as well, and we presumably do not want to widely deploy models that are one bad prompt away from dangerous scheming and self-exfiltration.
But as long as this rabbit-duck-problem persists, evals are of limited use to inform policy. For evals as a criterion for applying restrictions, this much room for interpretation makes agreeing on concrete thresholds less likely: For instance, imagine trying to define a scheming threshold beyond which a model should not be deployed or gain access to the internet. With equal conviction, safety advocates will argue that this threshold is reached when the scheming is elicited, and regulation critics will argue the threshold is reached through toy examples. For evals as a broader contribution to policy discussion, the vagueness makes for bad evidence; reliably, eval results that allow for such different interpretations will be pointed at by some and dismissed by others.
To support friends of evals in upcoming policy discussions, a shift in the content of ‘headline evals’ that make it into abstracts, media outlets, system cards and press releases would be very helpful. Loss of control and unreliability might be better covered with a stronger focus on propensity assessments, which leave less room for the ‘you asked the model to misbehave’-objection. Misuse might be better covered with not just an independent capability and propensity note, but also a more direct connection to counterfactual impact (e.g. a greater focus on explaining the possible harm arising from a model’s likely help in cyberattacks). To be clear: This is a political wishlist, and it might not work too well with the technical agendas and limitations of eval organisations. But it might be useful to keep in mind in charting the future course.
Normalizing Inaction
The more evals you do that don’t have any consequence on deployment, the less impactful future evals will be.
Fourth, focusing on eval capacity buildup and anchoring evals into the default release process without active progress toward binding and operationalized structure can be harmful: the current environment is getting everyone used to not reacting to eval results. Currently, there is no active jurisdiction in the world that requires new models to clear specific eval results before release. Voluntary commitments by developers are also often vague and rarely genuinely constraining. As a result, model releases are increasingly often accompanied by potentially fairly concerning eval results – that have absolutely no tangible impact on the models’ deployment. Because eval organisations are aware of their limited say on deployment decisions, there are rarely ever concrete recommendations on if and how to deploy the model; the eval results are just provided, and the deployment runs its course.
This erodes norms of appropriate reactions to evals. This is part of a broader issue plaguing AI safety, but it becomes obvious here. There are two plausible ways to read what’s going on with evals today:
Either eval results are indicative of seriously dangerous current capabilities (as some reactions would suggest). In that case, it would be a veritable scandal that models are still being developed and deployed without oversight. Eval organisations should then be much more aggressive about suggesting to cease development and deployment immediately, communicate their concern and objections to the public and policymakers, etc — both to protect from AI risks and to protect their credibility. (I don’t believe this is true.)
Or eval results are indicative of potentially noteworthy trajectories, and will only be informative regarding current risks sometime later. If that is true, current AI model releases are acceptable from the safetyist view. In that case, the dramaticism around their ‘risky’ capabilities is way overblown and provides plenty of reasons to discount future, more genuinely alarming results. The entire safety ecosystem would do very well to be very cautious about campaigning on these sorts of findings, then, or be accused of having cried wolf yet again.
In either case, the current combination of brakes-off deployment supported by dramatically reported-upon capability results will cost credibility.
This dynamic risks providing future justification for ignoring evals. There are easy-to-draw, hard-to-debate (false) equivalences. For instance, imagine a policy discussion around deferring to concerning evals for deployment decisions; and imagine that o1 causes no substantial damage until then. Regulation skeptics will be quick to point out that eval orgs had been skeptical of o1 too, but it had been released anyways and caused no harm, so their opinion should not be given much weight.
Both these issues would be much more manageable if evals culminated more visibly in some tangible recommendations ranging from ‘don’t release this at all’ to ‘release this with no limitations’. Similarly, they might culminate in a clear assessment of whether the eval is relevant for assessing the model or the trajectory, i.e. distinguishing between ‘we think this eval is relevant to assessing this current model’ and ‘we think this eval is relevant to assessing technological progress toward future models’. That recommendation doesn’t necessarily have to come from the eval orgs themselves, but could follow some result aggregation mechanism or neutral synthesis – lots of options come to mind. The important part is that there is a clear way in which external parties can assess whether anyone has in fact ‘cried wolf’ for a given model. This makes the wolf-crying distinct and actionable, and it reduces risks of alarmism allegations. Understandably, this is tricky terrain for eval organisations, as it brushes against the edges of their mandate and requires them, or someone, to make broader normative determinations beyond empirical observation. But as long as evals don’t provide tangible recommendation classes, chances that they’ll continue to be impactful drop with every vague inaction.
This is also why the impressive inroads made by the evals field, from increased funding and technical skills of eval orgs to increased media attention and increased integration into model release pipelines, are not necessarily progress all things considered. When evals become a cornerstone of a currently somewhat defective policy ecosystem, their credibility as a future basis for fundamentally different frameworks suffers. More uptake of evals in the current situation might make things worse.
Is AI safety going to be alright?
...
Yeah, we just need to do more evals.
...
Conclusion
Evals are an important cornerstone of designing safety-focused AI policy without undue burdens or imprecise instruments. But their future as a policy instrument faces peril: The current organisational structure of eval organisations is politically vulnerable, incentives point evals towards sensational results susceptible to misinterpretation, and their unclear current role provides precedent for ignoring them. Changes toward a clearer division of labor that includes eval organisations with no entangled policy interests and stronger mechanisms to tie evals to tangible recommendations and predictions would be a very welcome setup for the policy discussions ahead.
The End Of The Beginning For AI Policy
Did pushes for frontier AI regulation jump the gun? In this piece for Tech Policy Press, I argue: Mostly yes.
AI Safety Politics after the SB-1047 Veto
After California Governor Gavin Newsom’s veto of the controversial SB-1047, political dynamics everywhere might have irreversibly changed to the detriment of safety-focused AI policy: Economic concerns are boosted, doubt is cast on compute thresholds and enforcement avenues, and the debate is polarised.
After California Governor Gavin Newsom’s veto of the controversial SB-1047, political dynamics everywhere might be irreversibly changed to the detriment of safety-focused AI policy.
Introduction
The political process around SB-1047, the Californian AI bill strongly supported by AI safety advocates, has come to an end. But the politics of regulating frontier AI everywhere stand to be affected: For the last few weeks, all eyes were on California. The debate around SB-1047, Gavin Newsom’s ultimate decision and his explanations for it have been duly noted. I discuss four potential international repercussions: The veto might provide cover for laissez-faire policy everywhere, raise questions around how frontier AI regulation would be enforced, cast doubt on the use of compute thresholds in AI governance, and leave the debate irreversibly polarised.
The Veto Gives Cloud Cover for Laissez-faire Attitudes
The veto supports economic worries and general disbelief that undermine safety policy.
The veto provides strong political ammunition for opponents of safety-focused policy worldwide. This is a fairly obvious point, so I will just very briefly mention two narratives that can easily be spun from it:
First, the veto motivates an economic narrative: If even California, in its commanding position, is economically wary of safety policy, others can’t afford it, either. The possibility of ‘catching up’ and its interplay with risk-focused policy has long been an important topic in non-US discussions of AI governance, and it is likely to feature even more strongly given e.g. the economic woes of Western Europe and the growing economic relevance of AI. In that political dynamic, anything that can be read as confirmation that safety policy is opposed to productive AI development is harmful - and the veto can be read as just that.
Second, the veto motivates a broader anti-safety story: In many ways, California is the place most aware of the potential pace and impact of AI progress, and policymakers everywhere know it is. And a common international political stereotype, typically lacking some nuance, casts its government as fairly left-wing, anti-industry, pro-regulation. So if, at the nexus of safety awareness, a pro-regulation government is still not moved to pass a safety bill, that ought to make policymakers with less expertise and interest in the issue wary of that bill’s merits. Paradoxically, previous attempts of the safety coalition to downplay the impact of SB-1047 exacerbate this issue: If even a purportedly minimal bill could not get passed in California, policymakers elsewhere might believe the case for safety policy could not possibly be very strong.
With a Hesitant California, Safety Policy Enforcement Seems Uncertain
The veto might make international policymakers doubt whether their regulation can be meaningfully enforced in California.
SB-1047 received this much global political attention because of California’s outstanding status as the home state of most relevant AI developers. This also makes it the place where, ultimately, much of the enforcement of more restrictive safety policy proposals would have to happen: If illegal training runs were to be interrupted, provably dangerous models to be shut down, next-generation developments probed and prodded, and reckless AI developers held accountable, it would often need to happen in California. Now that the Californian executive has demonstrated its unwillingness to adopt safety-focused policy even at lower enforcement levels, I imagine that many policymakers, especially AI risk skeptics, will rightfully wonder: What good is making all these laws if the Californians won’t cooperate to enforce them?
Of course, to some extent, that question was always going to be asked - especially where the authority of jurisdictions to regulate frontier AI was called into question anyways, like prominently with the EU. But both the signal of signing SB-1047 and a potentially cooperative Frontier Model Board might have had a reassuring effect. Maybe Sacramento gets no say in the matter of enforcing international agreement or bilateral cooperation after all, and maybe a federal AI safety bill is in the cards, but until then, and given these uncertainties, questions around willingness to enforce and cooperate on enforcement could prove to be useful ammunition to opponents of safety-focused policy.
To illustrate, picture a safety advocate in conversation with a policymaker hesitant about the range of their mandate and the likelihood of enforcement, and compare the implications of the current outcome to either a successful SB-1047, where a strong Californian commitment and a cooperation-ready Frontier Model Board would have reassured international policymakers, or even a world without SB-1047 to begin with, where at least strategic ambiguity would have remained. It seems the safety advocate is now in a much worse position, and that this question will come back to haunt decisive national regulation and international agreements.
The Polarisation Genie Is Out Of The Bottle
Battle lines around AI safety policy are now drawn clearly. That changes debates everywhere.
The discussion around SB-1047 has seen the entrenchment of political fronts around frontier AI regulation, with the safety coalition and some incidental allies on one side and a broad alliance of industry, open-source-advocates and safety-sceptic academics on the other. This would be true whether the bill was vetoed or signed - but the veto leaves the safety coalition with all the harms of a polarised debate and still no law.
First, safety-focused policy could previously be pitched as common sense policy: prudent if risks manifested, harmless if not. Indeed, the pitch would continue, all sides of the debate accepted that some safety policy would be prudent, even leaders of the large AI corporations. As long as safety advocates were able to believably make this pitch, it reduced the threshold for initiating safety legislation: Historically pro-industry, anti-regulation parties could be convinced that this policy was different, pro-regulation forces could be persuaded that there was little risk of a drawn-out adversarial process, and policymakers on all sides could be convinced there was little political downside. This climate might have enabled some of the more unlikely sources of political support for AI safety, e.g. among the UK Tories, German Conservatives, or, earlier, GOP senators. Post veto, this becomes much harder: A clear anti-safety coalition has formed publicly, and its existence near guarantees there will be opposition in the parliaments, in the public debate, and within one’s own political backyard. Maybe this common-sense-pitch was a political illusion to begin with - but it was useful to safety policy, and it is now seriously damaged. To a policymaker, pushing safety policy now clearly spells out a bona fide political fight.
Second, any future fight will be harder. While the safety coalition was pretty well-established before this entire debate, the opposition was not. In that sense, this was the easiest that AI safety policy was ever going to get: To come out swinging while any opposition was still rallying, and hopefully pass the law before any would-be sceptics knew what hit them. Now, whenever one of the prominent supporters shows any movement on the issue, puts forward a bill or suggestion, somewhat organised opposition will be ready to mount in the early stages, with many pro-safety leaders already cast as broadly scrutinized and observed bogeymen. And the hardened fronts from this debate will likely transfer to debates in other jurisdictions as well: Many organisations and interest groups are the same, the debates are conducted on international platforms to begin with, and the lineup of Californian / US experts central to the SB-1047 debate is often invoked overseas as well.
Compute Thresholds Might Be Politically Damaged Goods
It’s noteworthy and impactful that the veto justification specifically mentioned compute thresholds.
Newsom advances many reasons for vetoing the bill, and much has been said about how seriously they should be taken. They might well be best understood as politically opportune messaging as opposed to his real motivations. But completely dismissing them still seems like a somewhat abrupt overcorrection, especially when it comes to reasons that do not seem like obvious political home runs. At the very least, something must have made Newsom feel that mentioning compute limits would be politically prudent, and potentially, that something might even be that it played a role in his decision. That means two things:
First, it goes to show that the compute limits might have genuinely been an unpopular element of the bill, following salient complaints that the threshold lines are often vague or arbitrary proxies. Even the fiercest advocates of compute-based governance often concede some of these points, but argue there’s no better method. Frustratingly, that might be a good argument for movement-internal discussions around the comparative merits of assessment methods, but it does not help much when discussing whether to pass a law at all or not: If a sceptic argues that a law should not be passed because it cannot accurately discriminate among models, they do not need to provide a constructive alternative; they would be content with no law at all.
Second, even if the compute thresholds played no role in motivating the veto, they are now a politically damaged tool: They are on record as a purportedly veto-motivating element. Likely enough, wherever next an AI safety bill comes up, it will still include compute thresholds; and it will be easy pickings for any opportunistic opponents of that bill to identify that relevant similarity to SB-1047 and point at Newsom’s specific criticism. This will hang around the neck of safety policy. It’s easy to write in a piece like this, but if they really are the best possible proxy for risk categories, compute thresholds will need a much better defence.
Outlook
Policymakers anywhere won’t miss the implications of this veto. Being cognizant of its likely effects might help the safety coalition preempt some of the more dire consequences for its policy case, and so I believe it’s valuable to have a discussion around them. This does not make this piece an indictment of the political strategy around SB-1047 itself - the politics did not work out in the end, but that is not the same as saying they were mistaken from the beginning. I look forward to discussing!
Preparing Policymakers for Non-linear AI Progress
In this piece for Tech Policy Press, I argue that the likely jagged curve of AI progress raises challenges for effective and credible policy work.
Corporate AI Labs’ Odd Role In Their Own Governance
Modern AI labs should be treated more like any other corporation: Accept that they follow business interest, don’t expect them to impede their profitability via corporate governance, and be skeptical about their policy contributions.
Plenty of attention rests on artificial intelligence developers’ non-technical contributions to ensuring safe development of advanced AI: Their corporate structure, their internal guidelines (‘RSPs’), and their work on policy. We argue that strong profitability incentives increasingly force these efforts into ineffectiveness. As a result, less hope should be placed on AI corporations’ internal governance, and more scrutiny should be afforded to their policy contributions.
This post was co-written with Dominik Hermle.
Introduction
Advocates for safety-focused AI policy often portray today’s leading AI corporations as caught between two worlds: Product-focused, profit-oriented commercial enterprise on the one hand, and public-minded providers of measured advice on transformative AI and its regulation on the other hand. AI corporations frequently present themselves in the latter way, when they invoke the risks and harms and transformative potential of their technology in hushed tones; while at the same time, they herald the profits and economic transformations ushered in by their incoming top-shelf products. When these notions clash and profit maximization prevails, surprise and indignation frequently follow: The failed ouster of OpenAI CEO Sam Altman revealed that profit-driven Microsoft was a much more powerful voice than OpenAI’s non-profit board, and the deprioritisation of its superalignment initiative, reportedly in favor of commercial products, reinforced that impression. Anthropic’s decision to arguably push the capability frontier with its latest class of models revealed that its reported private commitments to the contrary did not constrain them; and DeepMind’s full integration into the Google corporate structure has curtailed hope in its responsible independence.
Those concerned about safe AI might deal with that tension in two ways: Put pressure on and engage with the AI corporations to make sure that their better angels have a greater chance at prevailing; or take a more cynical view and treat large AI developers as simply just another private-sector profit maximizer - not ‘labs’, but corporations. This piece argues one should do the latter. We examine the nature and force of profit incentives and argue they are likely to lead to a misallocation of political and public attention to company structure; a misallocation of policy attention to AI corporations’ internal policies; and a misallocation of political attention and safety-motivated talent to lobbying work for AI corporations.
Only Profit-Maximizers Stay At The Frontier
Investors and compute providers have extensive leverage over labs and need to justify enormous spending
As a result, leading AI corporations are forced to maximize profits
This leads them to advocate against external regulatory constraints or shape them in their favour
Economic realities of frontier AI development make profit orientation a foregone conclusion. Even if an AI corporation starts out not primarily motivated by profit, it might still have no choice but to chase maximal profitability: As the track record of Anthropic, the seemingly most safety-minded lab, clearly demonstrates, even a maximally safetyist lab has to remain at the frontier of AI development: To understand the technical realities at the actual model frontier, to attract the relevant talent and motivate the necessary compute investments. The investments that are required to scale up the required computing power to remain at this frontier are enormous and are only provided by profit-driven major technology corporations or, to a much lesser extent, large-scale investors. And a profit-driven tech corporation seems exceedingly unlikely to hinge astronomical capex on an AI corporation that does not give off the unmistakable impression of pursuing maximal profits. If a publicly traded company like Microsoft had serious reason to believe that a model it spent billions of dollars on could just not be released as a result of altruistically motivated safety considerations that go beyond liability risks, its compute expenditure would at best be strategically imprudent and at worst violate its fiduciary duty. This is especially true when compared to counterfactually funding model development at another, less safety-minded, corporation or an in-house lab. And these external pressures might well be mounting in light of rising doubts around the short-term profitability of frontier AI development.
So, even a safety-motivated lab has no choice but to act as if it were an uncompromising profit-maximizing entity - lest its compute providers take their business elsewhere and the lab can no longer compete. The effects are clearly visible: When OpenAI board members briefly tried to change course, Microsoft almost took away most of their compute and talent in a matter of hours; in a compute crunch, OpenAI’s superalignment team reportedly got the short end of the stick; and OpenAI might well be gearing up for an IPO doing away with what’s left of its safety-focused structure. The recent, presumably antitrust-motivated, surrender of Apple’s and Microsoft’s respective board seats at OpenAI does cast some doubt on the institutional side of this mechanism, but it is important to note that Microsoft did not have a board seat before they managed to strongarm OpenAI into rehiring Altman - if anything, that episode showed how little the board matters. It would be exceedingly surprising if Microsoft’s main competitors did not have, or at least wrestle for, a comparably iron grip on their respective AI corporations. If they succeed, then these respective labs become full-on profit maximizers; and if they don’t, profit-maximizing labs supported by investors and compute providers might take their place.
This determines the role of AI corporations in policy and governance debate. With AI undergoing rapid technical innovation, financial investment and media awareness, it seems to be destined to be a key technology of the 21st century. This strategic importance will inevitably lead to public attention, political pressure and regulatory intervention in favor of making AI safe, controlled, and beneficial.
Profit maximization is not necessarily at odds with that goal - often, safe and reliable AI is the best possible product to release. Much safety-relevant technical progress, such as advances in making AI more predictable, more responsive to instructions or more malleable to feedback, has also made for meaningful product improvements. This incentive has its limitations - in race situations that offer great economic benefit from being first to reach meaningful capability thresholds, larger and larger risks of unsafe or unreliable releases might become economically tenable.
But more importantly, business interest is often at grave odds with external constraints. It is near-universally accepted that geostrategically and infrastructurally critical industries ought to be externally constrained through regulation and oversight - even where market pressures favor safe products. This oversight is usually thought to ensure strategic sovereignty, reduce critical vulnerabilities, prevent large-scale failures, etc. Profit maximizers will not willingly accept such constraints, even where they publicly endorse them in principle - because they believe that they might know best, because compliance is expensive, and because unconstrained behavior might sometimes be very profitable. Behind closed doors, the largest AI corporations have been some of the fiercest opponents of early legislative action that might constrain them, whether that is the EU’s AI Act, California’s SB-1047, or early initiatives in D.C. and elsewhere. Where constraints seem unavoidable, business interest suggests shaping them to be minimally restrictive or potentially even conducive: Companies might advocate for legal mechanisms that best fit their own technical abilities or that prevent smaller competitors from catching up.
This understanding of AI corporations’ profitability motives and their resulting role in policy dynamics should lead us to reassess their activity in three areas: Corporate structure, Responsible Scaling Policies, and corporate lobbying.
Constraints from Corporate Structure Are Dangerously Ineffective
Ostensibly binding corporate structures are easily evaded or abandoned
Political and public will cannot be enforced or ensured via corporate structure
Public pressure can lead to ineffective and economically harmful non-profit signaling
AI corporations, in alleging their binding commitment to responsible development, often point to internal governance structures. But because of the primacy of profitability even in safety-minded labs discussed above, arcane corporate structures might do very little to mitigate it. This is true for three reasons: Firstly, profit maximization can easily be argued to be a necessary condition for a valuable contribution to safety, as it secures the talent and compute required to begin with. So, even under an ostensibly prohibitive pro-safety structure, a single lab might justify pushing for profits and capabilities - look at Anthropic and its recent decision to release arguably the most advanced suite of LLMs to date. Secondly, corporate structures are often not very robust to change driven by executive leadership and major investors - again, it’s easy to point at OpenAI as a recent example of a quick de facto change in governance setup that could not even be halted by one of its founders. Thirdly, in many cases, existing companies might be easily exchangeable shells for what really matters under the hood: Compute that could potentially be reassigned by the compute providers, and research teams that can be poached and will follow the compute. That kind of move might be legally contentious, but between the massive asymmetries in legal resources and potential contractual provisions around model progress in compute agreements, it seems like a highly believable threat. Present corporate structures that suggest a stronger focus on safety or the common good might hence be best understood more cynically: They are only there because they have not interfered with profitability just yet, but can readily be evaded or dismissed one way or the other. Any other understanding is a set-up for miscalibrated expectations and disappointment, such as in the case of the internal changes at OpenAI.
Drawing the right lessons from that is important: Alleging that Sam Altman turned out to be a uniquely deceptive or machiavellian figure, or that OpenAI has undergone some surprising hostile takeover, misses the point and sets up future misconceptions. The lesson should be that there was a failure to understand the real distribution of power in corporate governance.
Expecting corporate governance to enforce public will to make AI safe might not only be mistaken, it could also soften pressure on political institutions to craft meaningful policies that address the societal challenges posed by AI. Insofar as the ineffectiveness of safety-minded corporate structures is not immediately obvious, the public and policy-makers could be inclined to believe that AI corporations were already sufficiently aligned with the public interest. This misunderstanding should be avoided: Public attention should instead pivot towards policymakers and the adequacy of their regulatory frameworks rather than dwelling on corporate board reshufflings. This shift might even benefit AI corporations - by reducing the resources they need to spend on costly charades like complex corporate structures designed to provide an impression of safety focus. Treating them like any other private company might in turn relieve them of the burden of costly non-profit signaling.
Hope In RSPs Is Treacherous
RSPs on their own can and will easily be discarded once they become inconvenient
Public or political pressure is unlikely to enforce RSPs against business interests
RSP codification is likely to yield worse results than independent legislative initiative
Therefore, much less attention should be afforded to RSPs.
Secondly, a rosy view of AI corporations’ incentives leads to overrating the relevance of corporate governance guidelines set by labs. Responsible scaling policies (RSPs) are documents that outline safety precautions taken around advanced AI - e.g. which models should undergo which evaluations, necessary conditions for model deployment, or development red lines - see those from OpenAI, Google DeepMind, and Anthropic. These RSPs are often met with interest, attention, sometimes praise and sometimes disappointment from safety advocates, and often feature in policy proposals and political discussion.
Unfortunately, there is very little reason to believe that such RSPs deserve this attention. Firstly, of course, RSPs by themselves lack an external enforcement mechanism. No one can compel the internal governance of an AI corporation to comply with their RSPs, nor to keep their RSPs once they feel they become inconvenient. RSPs are simply a public write-up of internal corporate governance valid exactly as long as company leadership decides it is. An optimistic view of RSPs might be that they are a good way to hold AI corporations accountable - that public and political attention would be able to somehow sanction labs once they did diverge from their RSPs. Not only is this a fairly convoluted mechanism of efficacy, it also seems empirically shaky: Meta is a leading AI corporation with industry-topping amounts of compute and talent and does not publish RSPs. This seems neither to have garnered impactful public and political scrutiny nor to have hurt Meta’s AI business.
This enforceability problem is sometimes thought to be answerable through RSP codification. RSPs might be codified, i.e. implemented in the form of binding law. This describes, in effect, a legislative process: For RSPs to be binding or externally enforceable in a meaningful sense, someone would have to be empowered to carry out an external, neutral evaluation of compliance and to enforce the measures often stipulated in the RSPs - for instance whether a model ought to be shut down, planned deployment ought to be cancelled, or training should be stopped. At present, it is difficult to conceive how this evaluation and enforcement should happen if not through executive action empowered by legislative mandate. So RSP codification is in effect simply a safe AI law - with one notable difference: That we do not start from a blank piece of paper, but with an outline of what AI corporations might like for the regulation to entail. The advantages of that approach might firstly come from AI corporations’ relevant expertise (which we discuss later), or from their increased buy-in. But it seems unclear why exactly buy-in is required: There is substantial political and public appetite for sensible AI regulation, and by and large, whether private companies want to be regulated is usually not a factor in our democratic decision to regulate them. The downside of choosing an RSP-based legislative process should be obvious - it limits, or at least frames, the option space to the concepts and mechanisms provided by the AI corporations themselves. But this might be a harmful limitation: As we have argued above, these companies are incentivized to mainly provide mechanisms they might be able to evade, that might fit their idiosyncratic technical advantages, that might strengthen their market position, etc. RSP codification hence seems like a worse path to safe AI legislation than standard regulatory and legislative processes.
Additionally, earnest public discussion and advocacy for the codification of RSPs may give off the superficial impression to policymakers that corporate governance is adequately addressing safety concerns. All of these are reasons why even strictly profit-maximizing AI corporations might publish RSPs - it quells regulatory pressures and shifts and frames the policy debate in their favor. Hence, affording outsized attention to RSPs and conceiving of RSP codification as a promising legislative approach is unlikely to lead to particularly safe regulation. It might, however, induce a false sense of confidence, reduce political will to regulate, or shift the regulatory process in favor of industry. We should care much less about RSPs.
For-Profit Policy Work Is Called Corporate Lobbying
For-profit work on policy and governance is usually called corporate lobbying. In many other industries, corporate lobbying is an opposing corrective force to advocacy
Corporate lobbying output should be understood as constrained by business interests
Talent allocation and policy attention should be more skeptical of corporate lobbying.
Thirdly, AI corporations’ lobbying efforts currently have a strangely mixed standing in safety policy debates. Leading AI corporations employ large teams dedicated to policy (similar, by all accounts, to government affairs teams at other corporations) and governance (with a less direct equivalent). On one hand, extensive lobbying efforts by AI corporations are often viewed critically, especially as they attempt to weaken regulatory constraints. On the other hand, these teams sometimes appear as allies of safety policy advocacy. Labs’ lobbying and governance teams frequently recruit from safety advocates, and entertain close relationships with the safety coalition. In fact, working for an AI corporation is often considered an advisable, desirable career step for non-profit safety advocates. This integration is highly unusual. In other industries, transfers from non-profit work to corporate lobbying happen, but they are usually not considered in service of the non-profits’ goals, and might at best be called green- (or safety-, or clean-, ...) washing and at worst betrayal of the cause.
The existence of governance teams, concerned with developing ostensibly impartial ideas on how to govern their technology, is also not very common in other industries. This phenomenon may be best understood in light of the historical context. Because AI is a recently emerging technology with a long prehistory of more theoretical speculation around its capabilities, research labs became generalist hubs on the technology, as no one - beyond the labs and academia - had the technical knowledge of how to regulate it safely, or the political interest to even think about it much. But at least since the ChatGPT release followed by the surge of AI interest and investment, political institutions and civil society organizations are starting to catch up. Dedicated bodies, such as the EU AI Office and the US & UK AI Safety Institutes, have been established and equipped with leading talent, and there is a thriving new ecosystem of AI think tanks and advocacy groups.
It would be a mistake to burden the discussion around frontier AI with tribalistic rhetoric as is often present in regulatory debates of other industries. But its prevalence in virtually all of these other sectors points at which respective roles and relationships have been historically defined for corporate lobbyists, non-profit and academic advisors, and policy-makers. Forfeiting this adversarial dynamic points to a perhaps naive understanding of what profit-maximizing AI corporations will allow their policy and governance teams to do. It is simply implausible that labs would, in due time and following their increased commercialization, pay a substantial department to create policy-related outputs that do not directly further their policy goals. Again, the cynical perspective might be most informative: Profit-maximizing companies pay policy and governance teams to shape policy and governance in a way that maximizes their profits while avoiding researching and publishing any policy proposal that could potentially hinder their AI products and thereby their future financial success. This can be enforced on the executive level by directly vetoing any potent safety policy or - more indirectly - by cutting financial, organizational and compute resources for safety research. On a personal level, employees might be inclined to practice anticipatory obedience and avoid going head-to-head with their employers, in turn protecting their salary, shares, reputation and future impact in steering AI progress from the inside. Besides, working for a big AI corporation building a transformative technology like no other is of course exciting - even a cautious individual could at least be somewhat captivated by the thought of shaping rapid, utopian technological progress and therefore choose to remain on board. No one in these teams needs to be ill-spirited or self-serving for this mechanism to take hold.
In fact, there might be a lot of value in creating policy that ensures progress and profit in an industry that promises as much economic and societal value as AI. We just claim that this value does not lie in causing first-order progress on making regulation safer.
This leads to a dynamic wherein leading governance talent with high standing in non-profit and government AI policy institutions makes suggestions that serve business interests, but face much less scrutiny than corporate lobbying attempts would in other industries. This shapes the policy debate toward the interests of the few AI corporations that currently entertain ostensibly safety-focused governance and policy teams.
Ultimately, this also might result in a misallocation of governance talent. Right now, working for the governance or policy team of a major lab remains a dream job for many ambitious, safety-minded individuals, further fueled by the social and cultural proximity between AI corporation employees and safety researchers and activists. This human capital could potentially be more effectively utilized in less constrained roles, like at governmental institutions or in research and advocacy roles.
This does not apply to safety-minded individuals conducting technical safety work at AI corporations. Much research progress on the technical level is incredibly beneficial to safety, and where it is, incentives of labs and safety advocates align - labs also want to build safe products. The misalignment only exists where policy, i.e. an external force that compels the labs, is concerned. So technical work remains valuable on any cynical understanding of lab incentives - and is also presumably sufficient to cash in on some of the benefits of having safety-minded employees at AI corporations, such as whistleblowing options and input on overall corporate culture.
Conclusion
The days of AI developers as twilight institutions between start-up and research lab are at best numbered and at worst over. The economics of frontier AI development will render them profit-maximizing corporate entities. We believe this means that we should treat them as such: No matter their corporate structure, we should expect them to choose profits over long-term safety; no matter their governance guidelines, ensuring safety should be the realm of policy; and no matter the intentions of their governance teams, we should understand their policy work as corporate lobbying. If we do not, we might face some rough awakenings.
I try to post updates on my writing, but not on much else, on X / Twitter.
Three Notes on ‘Situational Awareness’
I argue the series of events predicted in “Situational Awareness” is less likely: due to lack of market incentives for this progress, implausibility of decisive government response at the relevant time, and adaptation of AI labs to predictions.
A newly published series of essays predicts rapid advancements in artificial intelligence leading to government intervention and subsequent creation of superintelligence. I argue that this series of events is less likely due to lack of market incentives for this progress, implausibility of decisive government response at the relevant time, and adaptation of AI labs to similar predictions.
My discussion does not aim to engage with underlying technical claims. For the sake of argument, I will simply accept them as entirely accurate, though they’ve been critically discussed elsewhere.
Introduction
Leopold Aschenbrenner, formerly of OpenAI, has recently published a series of essays detailing his outlook on the developments ahead in AI. The essays are well-written, well-researched and insightful. Making them even more interesting, Leopold suggests – correctly, I believe – that his articulated view is representative of the beliefs of a larger group of important people in AI.
While I am not as immersed in the SF ecosystem, I have been thinking about and working on related issues for some time, and I feel I stumbled across some inconsistencies. With Leopold, I believe getting these predictions and the resulting policy response right is very important - so I briefly outline three areas where I think his argument gets stuck.
Profitability and Progress
AI capability gains between today and AGI might not be profitable enough to motivate investment.
Developing AGI & ASI might not be sufficiently attractive for private companies.
The essays begin with the observation of rapidly increasing investment in AI, especially in compute clusters and their surrounding infrastructure. Extrapolating these trends then leads to projections for capability milestones. Two strong arguments motivate these trend projections: Empirical observations around current spending, and the potential high profitability of developing AGI. But even on Leopold’s short timelines, there is a time between the current era and the final sprint to AGI and ASI. Leopold does not go into great detail on that era, but I believe crossing it might take much longer than he projects.
Firstly, short-term profitability of the next couple of model generations might not be given. Continuous growth of compute investment, according to Leopold, will have happened ‘as each generation [of models] has shocked the world’. The jump to GPT-3.5 and the ChatGPT application was impressive and motivated plenty of public, economic and political interest. This is maybe less obviously true for the jump to GPT-4, and less still for the progress within the GPT-4 tier of models. On the business side, adoption has been somewhat sluggish, with apprehension voiced at the cost of using advanced models; PR-sensitive reliability issues plaguing consumer-facing deployment; and regulation, liability, and ongoing lawsuits creating further barriers. So far, current models do not seem on track to motivate widespread adoption of $100+ subscription models, as Leopold suggests. These might well be growing pains – but to keep investors happy to fund the trend of compute costs, they would have to be outgrown fast. Otherwise, the well of compute funding might dry up past the already-committed resources, inviting a stronger focus on usability and efficiency than on capability gains. Of course, that might well be a high-revenue area. But as Leopold points out, the capex of major tech companies that drives the progress he assumes is unprecedented - intermediary returns that are merely very high don't cut it. Unprecedented returns are not yet certain.
Secondly, trying to develop AGI and ASI might be much less economically attractive to companies than Leopold suggests. At face value, creating AGI or ASI seems enormously profitable and desirable to any corporation – think of the growth, the power, the competitive advantage. Even if I were correct about lacking short-term profitability, this prospect would seem enough to motivate the enormous necessary investment.
But later, Leopold identifies (a) that governments are very unlikely to let a start-up (and, I assume, any privately-owned company) develop and control such technology themselves, and (b) that any company attempting such development will become a target of all kinds of espionage, sabotage, nationalization and more. If this is right (and I think it might well be), it changes the calculus dramatically. Getting in the sights of serious clandestine espionage is a serious threat to a private company, and being subjected to a ‘hamfisted’ home government response is likewise not particularly tempting. Susceptibility to adversarial attacks would be high, and success would not be very enticing - limited say over The Project and few profits to be made as the government takes over. That prospect is unlikely to be attractive enough to motivate the kind of funding Leopold’s projections require. Simply put: If serious progress towards AGI puts a target on your back and successfully reaching it makes the government take it away, then why try to build it at all?
Given this incentive landscape, a lot of different scenarios come to mind: some rogue company tries anyway; a government project is established even without a wake-up call; CCP-owned projects that don’t face such uncertainties pull ahead; AI progress stalls because the market simply does not incentivize building sufficiently powerful AI to kickstart AGI races; etc. These probably deserve some deeper examination. But I believe that they should at least cast some doubt on the suggested outlook.
Governments’ Wake-Up Moment
If the jump from AGI to ASI is fast and governments are slow, ‘The Project’ is less likely.
In discussing technical challenges around aligning superintelligence, Leopold emphasizes the paradigmatic gap between AGI and ASI, but postulates that we might get from AGI to ASI quickly, i.e. in less than a year. AGI’s failures are low stakes, the world is normal, and it is responsive to ~RLHF; but ASI is alien, its failures catastrophic, the world in upheaval, and alignment unclear. This, he argues, is one of the main challenges in achieving safe ASI – many of our well-precedented techniques for simpler models and systems stop working in the new paradigm, and we might drop the ball.
Simultaneously, Leopold argues that continuous improvement up to AGI and ASI will, at some point, lead the US government to step in and assume oversight over further frontier development. He argues this government endeavour, ‘The Project’, is likely the setting in which ASI (or maybe even AGI?) is built. There is little clarity as to when exactly this might happen, but he suggests it would likely be fairly late and would require a major wake-up moment. A successful instance of catastrophic misuse might be one possible watershed.
These two claims don’t mesh that well:
On the one hand, the governmental wake-up moment would be unlikely to happen before or during early AGI, characterized by Leopold as low-stakes and fairly easily aligned – catastrophic misuse or other outsized unexpected impacts of such a system seem unlikely. But on the other hand, the governmental wake-up moment also can’t really happen later than that, because we will go from AGI to ASI very fast, the government is slow to react, and The Project will be a fairly extensive endeavour including lots of political overhead.
This places a very specific requirement on Leopold’s narrative: The wake-up moment occurs (a) early enough to still consolidate research at The Project – i.e. well before the final sprint to ASI, but (b) late enough to leave no doubts around current relevance and future progress. At this time, the government intervenes decisively and commences The Project. This sequence and timing of events seems much less plausible and intuitive to me than many of Leopold’s other claims. It might well still happen like this – but if, by happenstance, the would-be wake-up misuse attempt fails, if a new capability gain is hidden or crowded out by more urgent news, if the CCP keeps a rival project under wraps, etc., this timeline gets thrown off very easily. And if the tight window for the wake-up moment passes, things likely play out very differently after all. That shrouds The Project in substantial uncertainty.
How do the ‘Situationally Aware’ React?
If AI lab leaders agree with Leopold’s predictions, they will try to interfere.
Lastly, if the essays do reflect the general thinking of the ‘situationally aware’, they allow some interesting insights into the likely thinking of major industry players in AI. If their predictions do align with Leopold’s, they might accordingly adjust their strategies – which might in turn affect his predictions. Specifically, two responses come to mind:
As suggested in my first objection, the prospect of nationalization might disincentivize labs from working on AGI. This could lead private sector AI development to adopt a carefully considered equilibrium, where they make sufficient incremental progress to ensure their products are competitive, but stay below the critical line of prompting wake-up worthy capability gains. Defection from this equilibrium is not impossible, but far from certain: Given the costs involved, not that many players could defect, and each of them might have major difficulties justifying their defection to shareholders in a market environment that does not favour races.
More alarmingly, the prospect of nationalization, sabotage and espionage might prompt AI labs to be deceptive about their AGI progress. Whether it is to ensure their own profits, from hubris or from distrust of government, AI lab leaders might have ample reasons to prevent nationalization. Hence, they might make sure not to raise the alarm in the first place. For instance, they might save up computational resources to skip a generation or two, prompting a faster, discontinuous jump to AGI/ASI without raising the alarm; wilfully hamstring or underreport intermediary model capabilities; or simply extensively influence the political process to prevent any interference. This does not only throw a wrench into Leopold’s predictions – it also interferes substantially with a lot of current safety mechanisms that rely on iterative, continuous progress.
There is little telling how prevalent Leopold’s predictions are with the leadership of AI labs, and what beliefs will ultimately play into their decisions. But I believe any prediction that simultaneously sees them stripped of their passion projects and profit potentials and hinges on somewhat honest, transparent or predictable behaviour on their part is likely to get a lot wrong. Adaptation to predictions needs to be accounted for.
I try to remember to post updates on new posts and writing (but not much else) on Twitter / X.
The Economic Case for Foundation Model Regulation
The EU AI Act’s foundation model regulation faces criticism motivated by economic concerns. A deeper look reveals that a strong regulatory focus on foundation models would instead be highly economically beneficial.
The EU AI Act’s foundation model regulation faces criticism motivated by economic concerns around the future of foundation model development in Europe. A deeper look at the EU AI ecosystem and resulting market dynamics reveals that such concerns might be highly misguided - a strong regulatory focus on foundation models would be highly economically beneficial.
I collaborated with Dominik Hermle on this text.
We were happy to see our thoughts shared and developed, e.g. in the EU AI Act Newsletter, in Euractiv, by Yoshua Bengio, and in an open letter of leading AI experts to the German government.
Introduction
Approaches to AI regulation are torn between an upstream focus on the providers of powerful foundation models and a downstream focus on the practical deployment of these models’ capacities. Recent discussion around the EU’s AI Act has seen the proposal of an article 28b stipulating extensive regulatory requirements for providers of foundation models. This proposal was met with economic concerns regarding the EU’s global position in the AI space. We argue that these economic objections are misguided, and make two core claims: (a) that the EU AI ecosystem will most likely not feature globally competitive foundation model providers, but will continue to consist mostly of downstream deployers of AI; and (b) that foundation-model-focused regulation leads to dramatically fewer regulatory burdens on EU AI players and enables a less constrained and more efficient market.
EU AI Regulation At a Crossroads
In April 2021, the EU Commission published its proposal for comprehensive regulation of artificial intelligence in Europe. This proposal, the AI Act, seeks to ensure safe and beneficial artificial intelligence in Europe by preventing harms from misuse and unreliability of AI systems while harnessing their economic and social potential. Following extensive negotiation, the EU trilogue is set to finalize the AI Act shortly.
Regulating Foundation Models
One of the few remaining controversies surrounding the AI Act concerns the regulation of so-called foundation models. Soon after the AI Act was originally suggested in 2021, public and political awareness of recent advances in AI research skyrocketed. This specifically motivated a stronger focus on the cutting edge of AI capabilities, which is driven by foundation models - particularly powerful AI models with a wide range of applications. For instance, ChatGPT is based on OpenAI’s foundational language models GPT-3.5 and GPT-4. Strong expert consensus warned of risks of misuse, or worse, of unreliability and loss of control, from foundation models. In consequence, the European Parliament suggested the addition of an article 28b to the AI Act, introducing requirements and liabilities for foundation models.
The specific details of 28b have since been the meandering subject of negotiations - the common element relevant for our point is that 28b envisions that some of the burden of ensuring safe and legal AI outputs is borne by the providers of foundation models, and not only by the deployers ultimately bringing the AI to the customer. For instance, models might have to undergo pre-publication screening for resistance to misuse or reliability; model providers might be liable for the harms caused by blatantly exploitable security gaps; or providers might be obligated to ensure model outputs cannot violate privacy rights.
The Economic Objection
The parliament’s suggestion of article 28b has faced strong resistance on economic grounds, led by the national governments of France and Germany. Their predominant concern is that regulating foundation models might endanger nascent European foundation model development, such as by French MistralAI and German Aleph Alpha. Facing contentions that EU foundation models are not yet cutting-edge, EU policy-makers point to growing interest and investment - and hope that EU providers will become globally competitive and strategically important. For instance, Aleph Alpha recently made headlines by securing a $500M funding pledge from a range of German businesses. The objection to article 28b then holds that burdening local AI providers with regulation at a crucial moment of their expansion could prevent the EU from catching up to the US and China. In contrast, established AI players like OpenAI, Google and Meta might benefit from a higher regulatory burden further entrenching their position as market leaders. Concerns are exacerbated by gloomy economic forecasts motivating a renewed focus on drivers of economic growth. Beyond the strictly economic, the EU has, in recent years and following geopolitical crises, strengthened its focus on strategic autonomy - a notion that might drive this most recent push to secure homegrown capabilities in key technologies.
We believe the argument against foundation model regulation from economic concerns is misguided and threatens to squander the potential of a comprehensive and effective EU AI Act: Even abandoning foundation model regulation entirely would be unlikely to create global cutting-edge foundation model development in the EU. However, adopting foundation model regulation offers significant, but mostly overlooked economic upsides - while also safeguarding against manifold risks.
EU Foundation Model Aspirations: A Reality Check
Foundation models from EU providers currently are not in a leading position in the international market. Even Europe’s foremost providers and their models - Germany’s Aleph Alpha, France’s Mistral, or, on a broader definition, French-American HuggingFace - lag behind in performance and in the end-user applications built on top of them. While GPT-3.5 and GPT-4 record over 100M users per week through ChatGPT alone, it is difficult to find substantial consumer-facing products based on Aleph Alpha’s or Mistral’s models. In the comprehensive HELM evaluation, which considers a range of benchmarks from question answering to text summarization, European models are far behind the state of the art. In its own performance benchmarks, Aleph Alpha only compares its current best public model, Luminous Supreme, to the original release version of GPT-3 from 2020, placing the EU three years behind the curve. Optimists will point out that Mistral’s most recent model compares favourably to Meta’s LLaMa-2; but even that only relates to one of LLaMa-2’s smaller versions, with 13B parameters. Already today, EU foundation model development is mostly on the back foot. And we believe the gap will widen.
To cash in on the economic and strategic potential of AI motivating the opposition to article 28b, a European foundation model provider would have to become a globally competitive large player in its own right. This aspiration is gated by three factors: compute - specialized computational hardware powering model training and continuous operation; data - a large training corpus of text and images that captures a broad range of subjects, tasks, and aspects of life; and talent - capable experts in machine learning and engineering to develop the models.
Lacking Computational Resources and Funding
Firstly, in the modern era of AI development, computational resources prove to be a prohibitive bottleneck for innovation. While some initial performance breakthroughs in deep learning have come from universities, this trend started to dramatically reverse once larger amounts of highly parallel and high-throughput compute (like GPUs or TPUs) were needed to achieve better performance. With an ever-growing focus on compute, lack of computational resources in the EU could stifle foundation model development. This bitter lesson on the essentiality of compute plays into the hands of the established players that already command large amounts of compute and infrastructure: Google DeepMind; OpenAI and Anthropic and their respective collaborators Microsoft, Alphabet and Amazon; and Meta each have access to some of the most impressive concentrations of compute on the planet. Owing to this concentration, running costs are comparatively low, further bolstering established players.
Catching up would be dramatically costly for European foundation model providers. The recent news of Aleph Alpha securing €500M of investment pledges was touted as a game changer - but it falls short of comparable investments made in the US by at least an order of magnitude. Operating expenses for a compute cluster to train a model like GPT-4 are estimated above $1B, with models under current development likely exceeding such costs substantially. OpenAI received $10B in funding from Microsoft earlier this year - on top of the already massive compute that enabled OpenAI to develop GPT-4, which to this day outcompetes many European alternatives. Even if the EU and member states sought to bridge this gulf with state subsidies, costs would be enormous and success doubtful: Their ambitious IPCEI Microelectronics II encompasses €8.1B in subsidies - distributed across 56 (!) companies. Even if Aleph Alpha were to receive an entire IPCEI worth of funding on top of its recent investment round, this would still barely match OpenAI’s single latest funding influx. There is little indication that sufficient funds for EU providers to catch up to the global frontier in compute exist.
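The orders of magnitude in this comparison can be checked with a few lines of arithmetic. The following is only a rough sketch: it uses nothing but the figures quoted above, the variable names are our own, and, as a simplifying assumption, it treats euros and dollars at parity.

```python
# Back-of-the-envelope comparison using only the figures quoted in the text.
# Assumption: EUR and USD are treated at parity; all names are illustrative.
ALEPH_ALPHA_PLEDGES = 0.5e9  # Aleph Alpha's investment pledges
OPENAI_MICROSOFT = 10e9      # Microsoft's funding for OpenAI
IPCEI_TOTAL = 8.1e9          # IPCEI Microelectronics II subsidies
IPCEI_COMPANIES = 56         # companies sharing those subsidies

# How many times larger is OpenAI's round than Aleph Alpha's pledges?
funding_gap_factor = OPENAI_MICROSOFT / ALEPH_ALPHA_PLEDGES

# What does an average IPCEI recipient actually receive?
ipcei_per_company = IPCEI_TOTAL / IPCEI_COMPANIES

# Hypothetical: Aleph Alpha receives an entire IPCEI on top of its pledges.
hypothetical_total = ALEPH_ALPHA_PLEDGES + IPCEI_TOTAL

print(f"OpenAI's round vs. Aleph Alpha's pledges: {funding_gap_factor:.0f}x")
print(f"Average IPCEI subsidy per company: {ipcei_per_company / 1e6:.0f}M")
print(f"Pledges plus an entire IPCEI: {hypothetical_total / 1e9:.1f}B")
```

Under these assumptions, even the hypothetical combined sum stays below the single $10B round, while an average IPCEI recipient receives well under €200M.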
Lacking Data and Talent
Similarly, data bottlenecks favor incumbent providers with connections to big tech corporations. To train larger models efficiently, increasing amounts of data are needed. Incumbent tech companies own huge amounts of text data that can be used for training: Meta, for example, trains its models on public Facebook and Instagram posts. StackOverflow, X/Twitter and Reddit all made drastic changes to their APIs this year, in large part to prevent the free use of their users’ data for training AI models. No European provider-to-be - and even worse, no European stakeholder at all - has access to a comparable wealth of exclusive data. Moreover, having a large user base creates a positive feedback loop where users generate new data, which can then be used to improve the current model, leading to more users. Incumbent providers have such user bases, European providers do not. If constraints and advantages from data continue to matter more and more, the EU hopefuls might thus face yet another substantial obstacle.
Lastly, in comparison, the EU lacks talent. General concerns about the attractiveness of EU countries to exceptional international talent in computer science, engineering etc. apply in full force to the exceedingly demanding challenge of building foundation models. A recent evaluation of AI researchers in Germany clearly demonstrates that this trend applies to AI - even where European countries manage to foster home-grown AI talent, the best and brightest are quick to leave for global hotspots. And when talent remains in the EU, international, non-native tech companies like Amazon or Meta are top employers. This talent gap grows - because the most qualified individuals are motivated to work with qualified experts, on the most exciting and cutting-edge projects. As state-of-the-art models are mostly developed in North America and the UK, EU countries lose even more home-grown talent to these countries.
Can EU Providers Find a Niche?
Faced with these challenges, many voices have instead suggested that European providers’ business case would be to occupy market niches via comparative cost advantages or specialisation on safety and EU-specific (e.g. GDPR) compliance. We believe this to be a realistic path forward for European providers. However, even with specialized models, European providers face obstacles: economies of scale surrounding compute, data and talent as well as the broad, non-specialized capabilities of foundation models make it exceedingly difficult to outcompete the largest providers. Hence, specialized foundation models offered by smaller providers might be unlikely to capture significant market share - case-specific applications or specialized versions of frontier general-purpose models could be more attractive in many cases. The same effect applies to safety and compliance: Due to expertise and scale, large providers are likely to be able to design more comprehensive and reliable guardrails. Gated by these effects, the EU’s foundation model development would be limited to a small suite of somewhat technologically outdated models with potentially higher operating costs.
While this niche is certainly valuable for some use cases, it is not obvious that it is valuable enough to motivate the elimination of article 28b: Firstly, if EU providers are unlikely to be at the global frontier of performance, it is less obvious how EU foundation models would constitute a strategic technology justifying decisive political intervention on the level of e.g. semiconductors - especially given the substantial costs of 28b’s elimination discussed later. Secondly, it is unclear why such a compliance-focused niche market would even have any reason to be concerned about compliance-focused provider regulation to begin with - instead, article 28b would even seem conducive to the realistic future for EU providers by providing regulatory support for their purported advantages over generalist models. And thirdly, in what is referred to as a ‘tiered approach’, 28b might only apply to larger and more general foundation models; in this case, the European champions and their future business cases would not even be affected by its stipulations whatsoever. At any rate, the niche market argument fails to justify abandoning provider regulation.
In summary, it is unlikely that European providers can become major, globally competitive players. For that, they lack the talent, the compute, and the data, with no apparent way to close this threefold gap - whether article 28b is eliminated or not. But if they are set to become providers of niche applications instead, they don’t require the protectionist response of eliminating 28b.
The Economic Case for Foundation Model Regulation
But while the EU might not be set to become a major nexus of foundation model development, AI is still set to be an ever-growing major part of the economy. Businesses will have their internal processes powered by AIs, existing software developers will build user-facing applications drawing on the power of foundation models, and start-ups will explore novel ways to harness AI for specific purposes - the common market is set to be rich in downstream deployers of AI. The deployer space is set to be much less consolidated around big players - just as with non-AI software today, a multitude of downstream applications for artificial intelligence exists, many of them specific to countries, languages or even single businesses, and none of them as demanding in terms of talent or compute. No obvious countervailing concentration mechanism as powerful as in foundation model provision comes to mind. So just as there are many European apps, but few European tech giants, there will surely be many downstream European deployers, but few providers.
Forgoing comprehensive foundation model regulation would pose a grave economic risk for this future of AI use and deployment in Europe. This is because, in a world without foundation model regulation, downstream deployers are set to bear the brunt of the regulatory burden. For many regulatory requirements that article 28b would impose on upstream providers, its removal would instead saddle the downstream deployers with even more burdensome provisions. Instead of the providers ensuring their model is not trained on protected data, deployers will have to ensure privacy-compliant outputs; instead of providers training models on safe and ethical guidelines, deployers will have to prevent unsafe or harmful output. Legally, this burden shift might play out in two ways: Firstly, the final version of the AI Act might explicitly impose responsibilities for preventing the display of potentially harmful or illegal foundation model outputs on the ultimate deployers. Or the deployers, as the user-facing entity, would simply be entirely liable for their applications’ output - including the potential security risks, privacy breaches, etc. passed on from the foundation model they employ. Either way: Deregulation of upstream foundation models massively increases the de-facto regulatory pressures on downstream deployers. Threefold economic potential lies in regulating foundation models and their providers instead:
Lightening Regulatory Burdens
Firstly and most obviously, vulnerable deployers face economic peril from the extent of regulatory burdens. As it stands, businesses in Europe already struggle with bureaucratic burdens - the software start-up ecosystem in particular struggles with the resulting hostile environment. Ensuring compliance of an application based on an unregulated foundation model is a large task. At minimum, it will require substantial expertise in navigating a range of requirements, extensive development and stress-testing of filters, and similar. Realistically, it might also require enlisting expensive, sought-after and otherwise unnecessary expertise in machine learning to modify employed foundation models to ensure compliance. Such requirements to set up a comprehensive safety division place a heavy burden on the vulnerable early stages of deployment and threaten to stifle a budding downstream ecosystem in the EU. Serious foundation model providers, even the smallest of which have to command a baseline of funding and relevant expertise far beyond many fledgling deployers, are surely more resilient to such burdens. Plus, recall that the EU is rich in deployers and poor in providers - so even if shifting regulatory burden to providers were a zero-sum game, this would favor the EU’s economy.
Lowering Compliance Costs
However, secondly, shifting the burden from providers to deployers will likely increase the overall burden, cost and even the possible extent of regulatory compliance. In ensuring safe and compliant model outputs, providers have a range of technical advantages: Some non-compliant output, especially relating to data protection and privacy, can best be addressed when selecting the initial training data - something only providers can do. Furthermore, many promising approaches to ensure safe outputs apply at the AI model’s initial training stage, with further modification of an already-trained model being a distant, less reliable and more costly second choice. And thirdly, foundation model providers are likely to be much more aware of their models’ specific shortcomings, providing them with further advantages in ensuring compliance. Furthermore, deployer-level regulation begets redundancy: If one foundation model services 100 deployers, provider-level regulation requires one central safety effort, while deployer-level regulation requires 100 separate designs and implementation loops for safety measures. Hence, the system-wide burden of compliance is lower in a provider-focused framework - making this focus not only advantageous to the EU, but overall positive-sum.
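The redundancy argument can be made concrete with a toy calculation. This is a minimal sketch under stated assumptions, not an empirical estimate: the function names and the cost unit are hypothetical, and it assumes that one safety effort costs a deployer exactly as much as it costs a provider, which, given the technical advantages described above, likely understates deployer costs.

```python
# Toy model of the redundancy argument; all cost figures are hypothetical.
# Assumption: a single safety effort costs the same for a provider and for
# a deployer. The text argues deployer efforts are costlier, so this is a
# conservative simplification.

def provider_level_cost(cost_per_effort: float) -> float:
    """System-wide cost when the provider runs one central safety effort."""
    return cost_per_effort


def deployer_level_cost(cost_per_effort: float, n_deployers: int) -> float:
    """System-wide cost when every deployer builds its own safety measures."""
    return cost_per_effort * n_deployers


# The example from the text: one foundation model servicing 100 deployers.
COST_PER_EFFORT = 1.0  # arbitrary unit
N_DEPLOYERS = 100

central = provider_level_cost(COST_PER_EFFORT)
redundant = deployer_level_cost(COST_PER_EFFORT, N_DEPLOYERS)

print(f"Provider-level regulation: {central:.0f} unit(s) of safety effort")
print(f"Deployer-level regulation: {redundant:.0f} unit(s) of safety effort")
```

In this simplified picture, the system-wide cost under deployer-level regulation scales linearly with the number of deployers, while the provider-level cost stays constant.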
Ensuring a Free Marketplace
Thirdly, foundation-model level regulation offers a much more efficient and dynamic AI marketplace. If the responsibility to comply rests with deployers, their solutions for compliance will likely have to be specific to the foundation model they employ, and its respective risks. Since risks and failure modes can vary drastically between different foundation models, it stands to reason that such compliance solutions would not enable ‘plug-and-play’ between different foundation models. And even if they did, plugging and playing might still come with high execution costs, such as when conducting fine-tuning via reinforcement learning. As a result, switching between foundation models would be highly costly for deployers, and the market for foundation models would become less and less contestable, leading to much more difficult conditions for newcomers and worse terms for deployers and ultimately consumers. This is particularly concerning given already salient concerns of market concentration in foundation models. However, with every element of security compliance already homogeneously built into each foundation model, bespoke compliance solutions shrink and the ensuing costs of switching fall.
Looking ahead, AI in the EU is likely to be heavy on deployers and light on providers. In that environment, foundation model regulation decreases both local and global regulatory burden and ensures fair and functional market mechanisms.
Alternatives to Foundation Model Regulation?
Defenders of deployer-focused regulation argue that regulatory burdens on deployers will lead providers to adapt, incentivizing them to provide compliant and safe models. We doubt this claim for two reasons: Firstly, the market might not be responsive enough to make such incentives stick: Path dependencies in choices of foundation models might make switches costly and unlikely, with further lock-in on the horizon as described above. And secondly, safety is likely at odds with maximum functionality - as AI models get more capable, they also become more difficult to safely align with legal desiderata. Whereas the hard requirements of regulation would ensure the prioritization of safety, a soft market incentive might still lead providers to forgo compliance in favor of further capability gains. Empirically, given the unreliability of their products, foundation model providers have so far not been able - or willing - to make reliable assurances about safety and reliability. Existing market concentration of foundation model development with only a handful of providers exacerbates the issue. Hence, market incentives seem unlikely to reduce economic harms from burdens on deployers.
Lastly, recent debate has seen the proposal of a ‘tiered approach’, in which regulation applies only to the largest foundation models - likely aimed to protect smaller foundation model providers. Overall, such an approach would accommodate many of the concerns above. However, we remain somewhat skeptical: If large, regulated foundation models and small, unregulated foundation models coexist, one of two issues might arise. Either employing an unregulated foundation model places additional regulatory burden on the deployers - in that case, deployers would suffer drastically increased costs from employing the smaller models. It is hard to imagine how these smaller models would then remain competitive, rendering the protection the tiered approach was supposed to afford them ineffective. If employing an unregulated foundation model comes with no additional deployer burdens, however, the AI Act leaves open a conceptual gap - applications built on the smaller models could produce harmful or illegal output with little legal recourse available. A tiered approach is much better than no foundation model regulation, but remains either somewhat inefficient or somewhat unsafe.
Conclusion
Opponents of article 28b’s provider-focused regulation perceive a tension between caution about risks on the side of provider focus and economic potential on the side of deployer focus. However, eliminating article 28b has little economic upside: EU foundation model development is too unlikely to catch up to the global frontier to warrant its protection via deregulation. Instead, the overall European AI ecosystem and its many AI deployers greatly benefit from foundation model regulation: by virtue of a lower local and global regulatory burden and a stronger and more efficient market. A strong economic case supports regulatory focus on foundation models and their providers.