Claude's new constitution
- •Anthropic officially releases Claude’s full 'constitution,' a 35,000-token document outlining the model’s core values and ethical training.
- •The document, previously leaked and dubbed the 'soul document,' is now available under a CC0 public domain license.
- •Reviewers include diverse external contributors, notably members of the Catholic clergy, specializing in moral theology and computer science.
Anthropic has officially pulled back the curtain on the underlying ethical framework of its latest flagship foundation model, Claude Opus 4.5. This disclosure follows a recent "leak" where users successfully prompted the model to reveal a massive internal document—often referred to as its "soul document"—which dictates its core values and behavioral boundaries.
The newly released "constitution" is a staggering 35,000 tokens in length, dwarfing the standard system prompt by a factor of ten. Unlike a simple set of instructions provided at the start of a chat, these values are baked directly into the model during its training phase. This process ensures that the AI’s responses align with human-centric principles from the ground up, rather than just acting as a superficial filter applied after the fact.
One of the most fascinating aspects of this release is the breadth of perspectives involved in its creation. Anthropic enlisted a diverse group of external reviewers to audit the document, including prominent members of the Catholic clergy. By incorporating experts in moral theology alongside computer scientists, the company aims to ground its AI in established ethical traditions. This move highlights a growing industry trend toward transparency, as labs attempt to demystify the internal "personalities" of their large language models for a skeptical public.