Acq. 2023·Research

Llama 2

One of the first large language models from a major AI lab released with weights licensed for commercial use, widening access to near-frontier AI.

Overview

Llama 2 was released by Meta AI on July 18, 2023, in partnership with Microsoft, marking a decisive moment in the tension between open and closed approaches to large language model development. Unlike its predecessor, the original LLaMA (released in February 2023 under a research-only license), Llama 2 came with a permissive license allowing commercial use for most organizations — those with fewer than 700 million monthly active users. The release included model weights in three sizes: 7 billion, 13 billion, and 70 billion parameters, along with fine-tuned chat variants called Llama 2-Chat trained using reinforcement learning from human feedback (RLHF).

Technically, Llama 2 built on the transformer decoder architecture with several refinements over the original LLaMA. It employed grouped-query attention (GQA) in the 34B and 70B variants to improve inference efficiency (the 34B model is described in the technical report but was not part of the public release), and extended the context window to 4,096 tokens, double that of LLaMA 1. The models were trained on approximately 2 trillion tokens of publicly available data, a 40% increase over the original. The Llama 2-Chat models underwent an extensive safety alignment process involving over 1 million human annotations and iterative rounds of rejection sampling and proximal policy optimization (PPO).
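One way to see why grouped-query attention helps at inference time is to estimate the size of the key/value cache, which scales with the number of key/value heads rather than the number of query heads. The sketch below is illustrative only. The layer count, head counts, head dimension, and 16-bit storage are assumptions chosen to resemble a 70B-class model and are not stated in this entry; only the 4,096-token context comes from the text above.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for a single sequence."""
    # Factor of 2: every layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Assumed 70B-style shape (hypothetical for this sketch, not from the article):
# 80 layers, 64 query heads, 8 shared key/value heads, head dimension 128.
N_LAYERS, N_QUERY_HEADS, N_KV_HEADS, HEAD_DIM = 80, 64, 8, 128
CONTEXT = 4096  # context window given in the text

# Standard multi-head attention keeps one key/value head per query head.
mha = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, CONTEXT)
# Grouped-query attention lets groups of query heads share a key/value head.
gqa = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT)

GIB = 1024 ** 3
print(f"multi-head attention cache:    {mha / GIB:.2f} GiB")
print(f"grouped-query attention cache: {gqa / GIB:.2f} GiB")
print(f"reduction factor: {mha // gqa}x")
```

Under these assumptions the per-sequence cache shrinks by the ratio of query heads to key/value heads, which leaves more accelerator memory for larger batches or longer prompts.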

The release was accompanied by a detailed technical report that documented not just architecture and training, but also safety evaluations and red-teaming efforts with external partners. Meta distributed the weights directly through its website and through Microsoft Azure and Amazon SageMaker, providing enterprise-grade infrastructure for deployment. Within days, the research and developer community had begun fine-tuning Llama 2 for specialized tasks, integrating it into open-source tooling, and benchmarking it against proprietary models — activities that would have been impossible without access to the weights.

Key Facts

Released July 18, 2023, with model sizes of 7B, 13B, and 70B parameters in both base and instruction-tuned (Chat) variants.
Trained on approximately 2 trillion tokens of publicly available text data, a 40% increase over the original LLaMA.
Llama 2-Chat models were fine-tuned using over 1 million human preference annotations across multiple rounds of RLHF.
The 70B model came close to GPT-3.5 on MMLU at the time of release (68.9% versus 70.0%) but trailed it clearly on coding, scoring 29.9% on HumanEval against 48.1% for GPT-3.5, according to the technical report.
The commercial-use license permitted deployment by any organization with fewer than 700 million monthly active users, covering the vast majority of potential commercial adopters.
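The rejection-sampling step used in the RLHF rounds can be illustrated with a best-of-K sketch: draw several candidate responses for a prompt, score each with a reward model, and keep the highest-scoring one as a fine-tuning target. This is a minimal sketch, not Meta's implementation. The names `best_of_k`, `toy_generate`, and `toy_reward` are hypothetical, and the toy generator and reward function stand in for a real chat model and a learned reward model.

```python
import random


def best_of_k(prompt, k, generate, reward):
    """Sample k candidates and return the one the reward function ranks highest."""
    candidates = [generate(prompt) for _ in range(k)]
    scored = [(reward(prompt, c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Hypothetical stand-ins: a real pipeline would call a language model and a
# trained reward model here.
_rng = random.Random(0)
_WORDS = ["sure", "here", "is", "a", "short", "answer", "sorry"]


def toy_generate(prompt):
    """Pretend to sample one response by picking a few random words."""
    return " ".join(_rng.choice(_WORDS) for _ in range(_rng.randint(1, 8)))


def toy_reward(prompt, response):
    """Pretend reward: prefer responses of about five words that do not refuse."""
    words = response.split()
    return -abs(len(words) - 5) - 3 * words.count("sorry")


prompts = ["Explain grouped-query attention.", "Summarize the license terms."]
# Each (prompt, best response) pair would become a supervised training example.
dataset = [(p, best_of_k(p, 4, toy_generate, toy_reward)[0]) for p in prompts]
for prompt, response in dataset:
    print(prompt, "->", response)
```

Raising K makes it more likely that at least one candidate scores well, at the cost of more sampling per prompt.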

Why It Matters

Llama 2 effectively gave the open-source AI ecosystem a commercially viable foundation to build upon. Before its release, organizations that wanted to build AI-powered products faced a stark choice: pay for API access to closed models like GPT-4 or Claude, or work with open models that often carried research-only restrictions. Llama 2 eased that constraint, enabling startups, enterprises, and independent developers to run powerful language models on their own infrastructure, fine-tune them on proprietary data, and ship products without per-token fees or data-sharing concerns. The downstream ecosystem expanded rapidly around this release: tools such as llama.cpp (which began with the original LLaMA) and Ollama added support for it, and dozens of fine-tuned derivatives followed.

The release also shifted the competitive and philosophical landscape of AI development. It pressured other labs and companies to reconsider their openness strategies and suggested that a major incumbent could release frontier-adjacent model weights without catastrophic misuse, a possibility that had been contested. Llama 2 became the reference point for subsequent open-weight model releases, helping establish norms around model cards, safety documentation, and acceptable-use policies. It remains one of the most downloaded and fine-tuned model families to date. Its architectural choices carried over into Llama 3 and are echoed in independently trained models such as Mistral 7B, while fine-tuned derivatives such as Vicuna v1.5 were built directly on its weights.

The People

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale

Sources

[1] Llama 2: Open Foundation and Fine-Tuned Chat Models. Hugo Touvron, Louis Martin, Kevin Stone, et al. · 2023. https://arxiv.org/abs/2307.09288

[2] Meta and Microsoft Introduce the Next Generation of Llama. Meta AI · 2023. https://ai.meta.com/blog/llama-2/

[3] LLaMA: Open and Efficient Foundation Language Models. Hugo Touvron, Thibaut Lavril, Gautier Izacard, et al. · 2023. https://arxiv.org/abs/2302.13971