Who Is Actually Writing the Rules for AI Safety?

While the world watches high-profile legislative debates in Brussels and Washington, a quieter and more consequential shift is under way in the technical laboratories of Silicon Valley, where the actual parameters of machine learning safety are being defined by the very companies that are supposed to be under scrutiny. This transfer of authority is a silent realignment of power, away from public legislative chambers and toward the internal engineering departments of private AI developers. As major governments struggle to keep pace with technological change, they have inadvertently delegated the heavy lifting of rule-making to the private sector, creating a governance model based more on corporate discretion than on public accountability.

This systematic delegation of authority has become the defining characteristic of the modern regulatory landscape. While high-level legal mandates often demand fairness, transparency, and safety, they rarely provide the technical specifications required to enforce these concepts in a production environment. Consequently, the critical gap between legislative intent and algorithmic implementation is being filled by the technical staff of the developers themselves. This vacuum has allowed “de facto” standards to emerge, where the internal documentation and design choices of a few major tech players are becoming the global benchmarks for compliance, largely because no independent public-sector alternative exists to challenge them.

The Silent Power Shift: From Public Legislation to Private Technical Governance

The current state of AI governance reveals a stark disconnect between the assertive language of politicians and the practical realities of software enforcement. Governments have successfully drafted broad frameworks, yet they have largely failed to populate these frameworks with the specific technical tripwires necessary to detect or prevent harm. This failure has forced a reliance on private AI developers to define the boundaries of their own oversight. In this environment, the technical documentation produced by a company’s engineering team often carries more weight in a courtroom or a regulatory audit than the vague wording of a federal statute.

Evaluating this shift requires an understanding of how technical specifications function as a form of law. When a regulator asks for a “safe” model, it is the engineer who decides whether that safety is measured by the frequency of toxic outputs, the robustness against adversarial attacks, or the accuracy of the model’s demographic classifications. By choosing which metrics to track and which to ignore, private firms are effectively legislating the operational meaning of public safety. As internal corporate documentation takes the place of traditional public-sector oversight, the power dynamic becomes lopsided: the reach of independent regulatory bodies shrinks relative to the influence of the developers they are meant to monitor.
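The point about metric selection can be made concrete with a minimal Python sketch, in which one set of evaluation results passes or fails depending on which metrics are tracked. Everything in it is hypothetical: the `EvalResult` fields, the numbers, and the limits in `THRESHOLDS` are invented for illustration and are not taken from any statute, standard, or real model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """Hypothetical evaluation results for one model (illustrative numbers only)."""
    toxic_output_rate: float         # share of sampled outputs flagged as toxic
    adversarial_success_rate: float  # share of adversarial prompts that succeed
    worst_group_accuracy: float      # accuracy on the worst-served demographic group


# Hypothetical limits: "max" means the value must not exceed the limit,
# "min" means the value must not fall below it.
THRESHOLDS = {
    "toxic_output_rate": ("max", 0.01),
    "adversarial_success_rate": ("max", 0.05),
    "worst_group_accuracy": ("min", 0.90),
}


def verdict(result, tracked_metrics):
    """Return True if the model passes every metric that is actually tracked."""
    for name in tracked_metrics:
        kind, limit = THRESHOLDS[name]
        value = getattr(result, name)
        if kind == "max" and value > limit:
            return False
        if kind == "min" and value < limit:
            return False
    return True


model = EvalResult(
    toxic_output_rate=0.004,
    adversarial_success_rate=0.12,
    worst_group_accuracy=0.93,
)

# The same model is "safe" or "unsafe" depending on who picks the metric list.
print(verdict(model, ["toxic_output_rate"]))                          # True
print(verdict(model, ["toxic_output_rate", "worst_group_accuracy"]))  # True
print(verdict(model, list(THRESHOLDS)))                               # False
```

In this sketch nothing about the model changes between the three calls. Only the list of tracked metrics does, and that list is the decision that currently sits with the developer.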

Mapping the influence of these major tech players shows that they are not merely following rules; they are constructing the infrastructure through which rules are understood and applied. Independent regulatory bodies often lack the high-performance computing resources and the specialized talent required to independently verify the claims made by large-scale model providers. As a result, the public sector is increasingly forced to adopt the proprietary safety benchmarks developed by the industry, further entrenching the “de facto” authority of private enterprise over the public interest.

Market Dynamics and the Evolution of Proprietary Safety Benchmarks

The Vocabulary Gap: Integrating Engineering Realities with Legal Accountability

The distance between the computer scientists building these systems and the policy experts drafting the regulations has created an epistemic gap that is difficult to bridge. Lawyers and ethicists frequently use terms like “meaningful human oversight,” but to an engineer working on a complex neural network, that phrase lacks a clear technical correlate. This vocabulary gap prevents regulators from issuing precise instructions and forces them to rely on the interpretations provided by the industry. When engineering realities are not integrated with legal accountability from the outset, the resulting rules are often either technically impossible to implement or legally toothless.

This gap is most visible in the trend toward “black box” optimization, where the complexity of a model makes it nearly impossible for a human observer to monitor its internal logic in real time. Legal requirements for human oversight often clash with the mathematical reality that these systems operate at a scale and speed beyond human cognitive limits. In the absence of external standards that address this conflict, the default practice has become self-certification. Companies are essentially permitted to define what “oversight” looks like for their specific architecture, a practice that puts operational efficiency ahead of external legal scrutiny and robust public protection.

Growth Projections and the Performance Indicators of Self-Regulated AI Markets

The economic impact of this regulatory delay is profound, as legislative frameworks such as the EU AI Act face postponed deadlines that stretch toward 2027 and 2028. These delays have created a market for third-party auditing services, but even these auditors often rely on the very benchmarks developed by the leading AI firms they are auditing. This circularity in performance indicators means that the growth of the AI safety market is currently tied to the industry’s own self-imposed definitions of risk. As long as the public sector remains behind the curve, the market for safety tools will continue to be shaped by the interests of the largest market participants rather than by independent safety requirements.

Technical standardization is also becoming a key driver of global market competitiveness. Nations that can successfully export their technical standards for AI safety gain a significant advantage in the global economy, as their domestic firms are already optimized for those requirements. However, when these standards are essentially proprietary benchmarks masquerading as public rules, the result is a market that favors established incumbents over innovative newcomers. The trajectory of this market suggests that the ability to set technical benchmarks is the most valuable form of intellectual property in the current decade, influencing everything from insurance premiums for algorithmic risk to the eligibility of firms for government contracts.

The Structural Impasse: Capacity Constraints and the Failure of Public Standard-Setting

A chronic lack of funding and a severe shortage of technical expertise within public standards bodies have exacerbated the structural impasse in AI governance. While private firms offer massive salaries to attract the world’s leading machine learning experts, public agencies often struggle to maintain even a basic staff of technical advisors. This disparity means that when technical standards are negotiated in international forums, the private sector is usually the best-represented and best-informed voice in the room. This imbalance makes it nearly impossible for the public sector to produce original, independent benchmarks that are not heavily influenced by corporate preferences.

Political incentives also play a role in maintaining this state of vagueness. Many policymakers prefer high-level requirements because they are easier to pass through legislative bodies and provide the flexibility needed to avoid stifling innovation. However, this flexibility is a double-edged sword; it ensures that the “real” rules remain unwritten until the moment of implementation, at which point the developer’s internal choices take precedence. This is the core of the “grading your own homework” phenomenon, where the lack of specific public-sector benchmarks allows high-risk algorithmic deployments to proceed without any meaningful external validation of their safety or fairness metrics.

Recent regulatory failures illustrate the high cost of this approach. The Dutch childcare benefits scandal, in which a fraud-detection algorithm disproportionately targeted innocent families, occurred in part because there were no external tripwires to define an acceptable error rate or provide a benchmark for human review. Similarly, legislative efforts like Canada’s Artificial Intelligence and Data Act (AIDA) faced significant pushback because the bill attempted to delegate almost all substantive rule-making to government departments without clear technical definitions. These examples suggest that without a foundation of neutral, technical standards, even the most well-intentioned AI laws will fail to protect the people they are designed to serve.
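What an “external tripwire” on error rates might look like can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of any real system: the record format, the group labels, the synthetic audit data, and the limits `max_fpr` and `max_ratio` are all hypothetical. The only idea taken from the text is that an acceptable error rate is fixed from outside and that breaching it triggers human review.

```python
from collections import defaultdict


def false_positive_rates(records):
    """Per-group share of non-fraud cases that the algorithm flagged.

    records: iterable of (group, flagged, actually_fraud) tuples.
    """
    innocent = defaultdict(int)
    flagged_innocent = defaultdict(int)
    for group, flagged, actually_fraud in records:
        if not actually_fraud:
            innocent[group] += 1
            if flagged:
                flagged_innocent[group] += 1
    return {g: flagged_innocent[g] / innocent[g] for g in innocent}


def tripwire(records, max_fpr, max_ratio):
    """Return the reasons (if any) for escalating the system to human review."""
    rates = false_positive_rates(records)
    reasons = []
    for group, rate in sorted(rates.items()):
        if rate > max_fpr:
            reasons.append(f"{group}: false positive rate {rate:.2f} exceeds {max_fpr:.2f}")
    lowest, highest = min(rates.values()), max(rates.values())
    if highest > max_ratio * lowest:
        reasons.append("false positive rates differ too much between groups")
    return reasons


# Synthetic audit sample (hypothetical): 100 non-fraud cases per group,
# plus a few correctly flagged fraud cases that do not affect the rates.
audit_sample = (
    [("group_a", True, False)] * 2 + [("group_a", False, False)] * 98
    + [("group_b", True, False)] * 10 + [("group_b", False, False)] * 90
    + [("group_a", True, True)] * 5
)

for reason in tripwire(audit_sample, max_fpr=0.05, max_ratio=2.0):
    print(reason)
```

The design choice that matters here is who sets `max_fpr` and `max_ratio`. If the provider sets them, the check is self-certification; if a public body sets them, the same code becomes an external benchmark.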

Navigating the Compliance Void in Global AI Regulatory Frameworks

The implementation of major frameworks like the EU AI Act and various U.S. Executive Orders has hit a significant roadblock due to the absence of “harmonized technical standards.” These standards are meant to be the bridge between the law and the code, but the bodies responsible for drafting them have missed critical deadlines, leaving a compliance void. In this vacuum, companies are left to guess what will satisfy future regulators, often opting for the path of least resistance by adopting the metrics of the most dominant industry players. The consequence of these missed deadlines is a period of prolonged uncertainty that favors those with the legal and technical resources to define their own compliance paths.

Outdated legal precedents are also being misapplied to modern machine learning, further complicating the regulatory environment. For instance, many jurisdictions are attempting to use 1970s-era employment guidelines to regulate AI-driven hiring tools. These old rules were designed for paper-and-pencil tests and lack the nuance required to evaluate the subtle biases of a neural network trained on millions of data points. When the law fails to provide modern, technical categories for risk, it forces the judiciary to rely on self-imposed metrics from the companies themselves, undermining the ability of victims of algorithmic discrimination to seek meaningful redress in court.

The impact of this compliance void extends beyond the courtroom to the very design of the technology. When there are no clear external rules, developers prioritize technical optimization—speed, efficiency, and predictive power—over safety features that might slow down the system. This results in a global race to the bottom, where the lack of transparent, public safety protocols allows the most aggressive developers to set the pace for the entire industry. Until the public sector can provide a set of stable, enforceable benchmarks, the definition of what constitutes a “safe” AI system will remain a moving target controlled by those with the most to gain from its rapid deployment.

Reclaiming Control: Innovations in Co-Drafting and Multi-Stakeholder Oversight

One promising path toward reclaiming public control is the emergence of integrated drafting processes that bring technical and legal experts together from the very beginning. Instead of lawyers writing a law and engineers trying to figure out how to follow it years later, co-drafting ensures that the vocabulary and categories of risk are synthesized from the design phase. This approach allows for the creation of rules that are both legally robust and technically enforceable, reducing the reliance on private-sector interpretation. These “two-community frameworks” aim to create a shared language that can bridge the gap between engineering realities and the protection of fundamental rights.

Future growth areas also include the development of decentralized auditing and neutral, public-sector benchmarks. There is a growing demand for independent organizations that can provide third-party validation of AI systems using standards that were not written by the developers themselves. These organizations could serve as a check on the self-certification model, providing regulators with the data they need to enforce compliance without having to rely on the provider’s internal documentation. Moreover, international competition is driving the demand for transparent safety protocols, as countries realize that a reputation for “trustworthy AI” can be a powerful competitive advantage in the global marketplace.

Integrated oversight also requires a shift in how we think about algorithmic risk. Rather than treating safety as a final check at the end of the development cycle, it must be integrated into the entire lifecycle of the system. This means developing neutral benchmarks for data quality, model training, and post-deployment monitoring that are accessible to the public and easy for regulators to apply. By fostering a diverse ecosystem of stakeholders—including civil society, academia, and independent technical experts—the public sector can build the interdisciplinary capacity required to ensure that AI serves the common good rather than just the interests of its creators.

Sustaining the Public Interest: The Future of External Algorithmic Benchmarking

The findings of this report indicate that the current trajectory of AI governance has resulted in a significant handover of authority to the private sector. This transition has happened not because of a lack of political will, but because public institutions lack the technical capacity to populate their legal frameworks with meaningful benchmarks. The evidence shows that when safety definitions are outsourced to the regulated entities, the public interest often takes a backseat to corporate operational flexibility and technical optimization. The systemic reliance on self-certification and internal documentation has created a governance model that functions more like a closed-door agreement than a transparent public process.

Building the necessary interdisciplinary capacity within the public sector is therefore a primary recommendation for ensuring future accountability. This involves not only increased funding for standards bodies but also the creation of new institutional structures that synthesize legal and technical expertise. The findings suggest that the most effective way to protect fundamental rights is to develop external, independent benchmarks that regulators can use to evaluate systems without relying on the developer’s own metrics. Such benchmarks provide the “external tripwires” needed to detect harm before it can accumulate at scale, shifting the burden of proof from the public back to the provider.

External standards are the only way to sustain public trust in the era of artificial intelligence. The ultimate test of a regulatory framework is whether it allows for an independent audit that could reach a different conclusion than the provider’s own internal assessment. To achieve this, a shift toward decentralized auditing and co-drafting is essential. By establishing neutral, publicly vetted categories of risk, the public sector can prepare itself to manage the challenges of a rapidly evolving technological landscape, ensuring that the rules for AI safety are ultimately written by the public, for the public.
