Acq. 2023·Product

Claude 1

The first publicly released large language model trained with Constitutional AI, offered as evidence that safety and capability need not be in tension.

Overview

Claude 1 was released by Anthropic in March 2023, entering a market already electrified by OpenAI's ChatGPT, in the same month that OpenAI released GPT-4. Anthropic had been founded in 2021 by former OpenAI researchers, including Dario Amodei and Daniela Amodei, who left to pursue a more safety-focused research agenda. Claude 1 was the first public product to embody that agenda, making it a landmark not merely as a chatbot but as a proof of concept for a particular philosophy of AI development.

The model was trained using a technique Anthropic called Constitutional AI (CAI), first described in a December 2022 paper. Rather than relying solely on human feedback to suppress harmful outputs, CAI introduced a set of explicit principles, a 'constitution', that the model uses to critique and revise its own outputs during training. The process has two stages: a supervised learning stage, in which the model is fine-tuned on its own revised responses, and a reinforcement learning stage, in which AI-generated preference judgments stand in for human harmlessness labels (reinforcement learning from AI feedback, or RLAIF). This reduced the need for large volumes of human-labeled harmful content and allowed Anthropic to articulate and encode its safety values in a transparent, inspectable document.
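The two stages described above can be illustrated with a minimal sketch. It is not Anthropic's implementation: the `Model` type, the `PRINCIPLES` wording, the prompt templates, and `toy_model` are all hypothetical stand-ins, and a single callable plays every role (drafting, critiquing, revising, judging) purely to keep the example self-contained.

```python
import random
from typing import Callable, List

# Hypothetical stand-in for a language model: prompt text in, completion out.
Model = Callable[[str], str]

# Illustrative principles only; NOT the wording of Anthropic's constitution.
PRINCIPLES = [
    "Identify ways the response is harmful, then rewrite it to remove them.",
    "Identify ways the response is dishonest, then rewrite it to remove them.",
]


def critique_and_revise(model: Model, prompt: str, principles: List[str],
                        rng: random.Random, rounds: int = 2) -> str:
    """Stage 1 (supervised): draft a response, then critique and revise it."""
    response = model(prompt)
    for _ in range(rounds):
        principle = rng.choice(principles)  # sample one principle per round
        critique = model(f"Critique per principle: {principle}\nResponse: {response}")
        response = model(f"Revise using critique: {critique}\nResponse: {response}")
    # In a real pipeline, (prompt, response) pairs would become fine-tuning data.
    return response


def ai_preference(judge: Model, prompt: str, a: str, b: str, principle: str) -> int:
    """Stage 2 (RLAIF): an AI judge picks the response that better fits a principle.

    Returns 0 if the judge prefers response A, otherwise 1. In a real pipeline
    these labels would train a preference model used as the RL reward signal.
    """
    verdict = judge(f"Principle: {principle}\nPrompt: {prompt}\n(A) {a}\n(B) {b}\nBetter:")
    return 0 if verdict.strip().upper().startswith("A") else 1


def toy_model(text: str) -> str:
    """Deterministic placeholder so the sketch runs without any real model."""
    if text.startswith("Critique"):
        return "too blunt"
    if text.startswith("Revise"):
        return "revised draft"
    if text.startswith("Principle"):
        return "B"
    return "first draft"


if __name__ == "__main__":
    rng = random.Random(0)
    print(critique_and_revise(toy_model, "hi", PRINCIPLES, rng))
    print(ai_preference(toy_model, "hi", "x", "y", PRINCIPLES[0]))
```

The sketch shows only the data flow: principles enter as plain text in both stages, which is why the constitution can be read and inspected independently of the model.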

Claude 1 was made available initially through an API in limited access, with enterprise partners such as Quora (for its Poe platform) among the first integrations. Anthropic positioned Claude as particularly strong on helpfulness, harmlessness, and honesty — the 'HHH' framework that became a signature of their alignment research. Reviewers and early users noted that Claude 1 was notably more willing to engage with nuanced or ambiguous requests than contemporaneous models tuned for refusal, while still declining clearly harmful instructions, a balance that drew significant attention from the research community.

Key Facts

Released publicly in March 2023, approximately three months after the December 2022 publication of the Constitutional AI research paper.
Trained using Constitutional AI. The CAI paper describes a set of roughly 16 explicit principles; the constitution Anthropic published for Claude in 2023 draws on sources including the UN Universal Declaration of Human Rights.
Employed RLAIF (Reinforcement Learning from AI Feedback) as a core alignment mechanism, reducing reliance on human-labeled examples of harmful content.
Made available via API with a context window of approximately 9,000 tokens at launch — substantially larger than the 4,096-token default of GPT-3.5 at the time.
Anthropic announced a $450 million Series C funding round in May 2023, led by Spark Capital with participation from Google and other investors, shortly after Claude 1's release.

Why It Matters

Claude 1 demonstrated for the first time at commercial scale that Constitutional AI could produce a competitive, deployable large language model, not just a research artifact. By making safety a structural design constraint built into the training pipeline rather than a post-hoc filter, Anthropic offered an alternative to the dominant approach of reinforcement learning from human feedback (RLHF). This influenced how the broader field thought about the relationship between alignment techniques and model capability, spurring follow-on work at other labs on scalable oversight and AI feedback mechanisms.

The release also marked the emergence of a genuine competitive dynamic in 'safety-first' AI development, lending institutional credibility to the argument that responsible scaling and commercial viability could coexist. Anthropic's approach of publishing the constitutional principles and the underlying CAI research gave researchers, policymakers, and the public an unusual degree of transparency into how the model's values were engineered — a precedent that raised expectations for explainability across the industry. Claude 1 thus shaped not only subsequent Anthropic models (Claude 2, Claude 3) but also broader norms around documentation and accountability for frontier AI systems.

The People

Dario Amodei, Daniela Amodei, Jared Kaplan, Tom Brown, Chris Olah, Sam McCandlish, Jack Clark

Sources

[1]

Constitutional AI: Harmlessness from AI Feedback

Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, Jared Kaplan · 2022

https://arxiv.org/abs/2212.08073

[2]

Introducing Claude

Anthropic · 2023

https://www.anthropic.com/index/introducing-claude

[3]

Claude's Constitution

Anthropic · 2023

https://www.anthropic.com/index/claudes-constitution