Document Type
Article
Publication Date
2025
Abstract
Many artificial intelligence (AI) researchers now believe that AI represents an existential threat to humanity. The most dangerous threat posed by AI is known as the alignment problem: the risk that a sufficiently intelligent and capable AI system could become misaligned with the goals and values of its human creators and instead pursue its own objectives to the detriment of humanity, including the possibility of extinction.

The tension at the heart of the alignment problem is familiar to scholars of agency, contracts, and corporate law, though it goes by a different name: the principal-agent problem. In the traditional principal-agent problem, an agent has an incentive to act in ways that advance the agent's own interests rather than the interests of the principal. This divergence of interests gives rise to agency costs that decrease the value of the agency relationship. To reduce agency costs, the principal and the agent use a variety of alignment mechanisms to realign their interests, such as contracts, control rights, and fiduciary duties. Many of these alignment mechanisms have analogs in the AI context. For example, AI agents respond to incentives built into their reward functions much as human agents respond to performance-based compensation.

One of the most important lessons from the principal-agent literature is that no single alignment mechanism can completely align the interests of the principal and the agent. Instead, parties in an agency relationship use a variety of alignment mechanisms to respond to different types of agency costs. For example, corporations use a mix of contracts, shareholder voting, board oversight, and fiduciary duties to align the interests of managers and shareholders. The same is true of the alignment problem: no single alignment mechanism can prevent AI misalignment. Yet despite the growing literature on AI safety, little attention has been given to the complex, interconnected nature of the alignment problem and the need for a multifaceted solution.

This Article aims to fill that gap. Drawing on complexity theory, the Article argues for a "layered" approach to AI alignment in which a variety of alignment mechanisms are layered together to respond to different aspects of the alignment problem. Layered alignment has significant implications for the governance and regulation of AI. Implementing a layered approach will require a high level of coordination and cooperation between public and private AI stakeholders, and that need arises amid an escalating AI arms race between leading AI companies as well as between nations. To facilitate coordination among AI stakeholders, the Article calls for the creation of an international AI regulatory agency. It is time for us all to start working together, before it is too late.
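The analogy between AI reward functions and performance-based compensation can be made concrete with a toy model. The Python sketch below is not drawn from the Article; the actions, payoffs, and audit penalty are hypothetical values invented for exposition. It shows how an agent optimizing a misspecified proxy reward (pay per completed task, with quality unpriced) diverges from the principal's true objective, and how layering a second mechanism, a quality audit, onto the same contract shifts the agent's optimum back toward the principal's interests.

    # Illustrative sketch only: the actions, payoffs, and penalty below are
    # hypothetical and are not taken from the Article.

    ACTIONS = {
        # action: (tasks_completed, quality between 0 and 1)
        "careful_work": (3, 0.9),
        "rushed_work": (10, 0.2),
    }

    def principal_value(tasks, quality):
        """What the principal actually wants: useful, high-quality output."""
        return tasks * quality

    def proxy_reward(tasks, quality):
        """The agent's reward function: pay per completed task.
        Quality is unpriced, like a poorly drafted compensation contract."""
        return tasks

    def layered_reward(tasks, quality, audit_penalty=12.0):
        """The same contract with a second, layered mechanism:
        a quality audit that penalizes shoddy work."""
        return tasks - audit_penalty * (1.0 - quality)

    def best_action(reward_fn):
        """The action a rational agent picks under a given reward function."""
        return max(ACTIONS, key=lambda action: reward_fn(*ACTIONS[action]))

    if __name__ == "__main__":
        for reward_fn in (proxy_reward, layered_reward):
            choice = best_action(reward_fn)
            tasks, quality = ACTIONS[choice]
            print(f"{reward_fn.__name__}: agent picks {choice!r}; "
                  f"value to principal = {principal_value(tasks, quality):.1f}")

Under the proxy reward alone the agent chooses rushed_work (value 2.0 to the principal); with the audit layered on, it chooses careful_work (value 2.7). Neither mechanism is taken from the Article itself; the point is only that stacking mechanisms changes the agent's optimum where a single mechanism does not.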
Recommended Citation
Spencer Williams, Layered Alignment, 23 U.N.H. L. Rev. 301 (2025).
Available at: https://scholarlycommons.law.cwsl.edu/fs/488