The Gallery
Acq. 2024·Product

Gemini 1.5

A one-million-token context window shattered the practical limits of what a single AI prompt could contain.

Overview

Gemini 1.5 Pro was announced by Google DeepMind in February 2024, introducing a context window of up to one million tokens: roughly 700,000 words, or the equivalent of several full-length novels, an entire codebase, or about an hour of video. This was several times larger than the 128,000- to 200,000-token windows of the most widely deployed models at the time. The capability was built on a Mixture-of-Experts (MoE) architecture, which lets the model activate only a subset of its parameters for any given input, improving efficiency at scale.
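To make "activating only a subset of parameters" concrete, here is a minimal sketch of generic top-k expert routing as described in the sparse MoE literature. It is not Gemini's router, whose details are not given here; the toy linear experts, the dimensions, and every name (`moe_forward`, `make_expert`, `router_weights`) are assumptions chosen for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(rng, dim):
    """Hypothetical toy expert: a random dim x dim linear map."""
    w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]

    return expert


def moe_forward(x, router_weights, experts, k=2):
    """Route one token vector to its k highest-scoring experts."""
    # The router produces one logit per expert.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    # Only the top-k experts run; the rest are skipped for this token.
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Gate weights are renormalized over the selected experts only.
    gates = softmax([logits[i] for i in top])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, top):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, top


rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 8
experts = [make_expert(rng, DIM) for _ in range(NUM_EXPERTS)]
router_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [0.5, -1.0, 0.25, 2.0]
output, chosen = moe_forward(token, router_weights, experts, k=2)
print(f"experts used: {sorted(chosen)} of {NUM_EXPERTS}")
```

With `k=2` and eight experts, only a quarter of the expert parameters are touched for this token, which is the efficiency argument in miniature.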

The technical report accompanying the release documented that Gemini 1.5 Pro could perform 'needle-in-a-haystack' retrieval tasks with greater than 99% recall across its full context length, locating a single inserted fact within a corpus of one million tokens. Google also stated that the model could process up to 11 hours of audio, 1 hour of video, 30,000 lines of code, or 700,000 words of text in a single context. These results suggested substantive changes in how a transformer-based model handles long-range dependencies, not merely a larger configured limit.
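The shape of a needle-in-a-haystack evaluation can be sketched in a few lines. The following is an illustrative harness, not the protocol used in the technical report: the filler text, the needle wording, and the function names are all hypothetical, and `stub_model` is a plain string search standing in for a real model call so that the example runs offline.

```python
import random

FILLER = "The quick brown fox jumps over the lazy dog."
NEEDLE = "The secret passphrase is {code}."
QUESTION = "What is the secret passphrase?"


def build_haystack(num_sentences, depth, code):
    """Insert the needle at a fractional depth (0.0 = start, 1.0 = end)."""
    sentences = [FILLER] * num_sentences
    position = round(depth * num_sentences)
    sentences.insert(position, NEEDLE.format(code=code))
    return " ".join(sentences)


def stub_model(context, question):
    """Stand-in for a model call (assumption): finds the needle by string search."""
    marker = "The secret passphrase is "
    start = context.find(marker)
    if start == -1:
        return ""
    end = context.find(".", start + len(marker))
    return context[start + len(marker):end]


def recall(ask_model, lengths, depths, seed=0):
    """Fraction of (length, depth) trials in which the answer contains the needle."""
    rng = random.Random(seed)
    hits = trials = 0
    for n in lengths:
        for d in depths:
            code = f"{rng.randrange(10**6):06d}"
            answer = ask_model(build_haystack(n, d, code), QUESTION)
            hits += code in answer
            trials += 1
    return hits / trials


score = recall(stub_model, lengths=[100, 1000], depths=[0.0, 0.25, 0.5, 0.75, 1.0])
print(f"recall: {score:.2%}")
```

Swapping `stub_model` for a function that calls an actual model, and scaling `lengths` toward the model's context limit, gives the grid of context length against needle depth that such evaluations typically report.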

Google made Gemini 1.5 Pro available to developers through Google AI Studio and the Gemini API in early 2024, initially with a standard 128,000-token context window; the one-million-token window was offered first to a limited group of developers and enterprise customers in preview and then progressively extended. Google described the mid-size model as performing at a level comparable to Gemini 1.0 Ultra, its largest model at the time, and emphasized practical usability at extreme context lengths as well as raw capability. Its multimodal nature (processing text, code, images, audio, and video within a unified architecture) distinguished it from contemporaneous long-context competitors.

Key Facts

  • Context window of up to 1,000,000 tokens, described by Google as the longest of any large-scale foundation model at the time of its announcement in February 2024.
  • Achieved greater than 99% recall accuracy on needle-in-a-haystack retrieval tasks at the full 1-million-token context length.
  • Uses a Mixture-of-Experts (MoE) architecture, activating only a subset of parameters per token for computational efficiency.
  • Capable of processing multimodal inputs including 11 hours of audio or 1 hour of video in a single context.
  • Demonstrated in-context learning of a new language (Kalamang, with fewer than 200 speakers) from a single grammar manual provided in the prompt, scoring comparably to a human learner.

Why It Matters

Before Gemini 1.5, the practical ceiling for a single model context was roughly 32,000 to 200,000 tokens, forcing developers to build complex retrieval-augmented generation (RAG) pipelines to handle large documents or codebases. A one-million-token window changed the design space: tasks that required chunking, summarization cascades, or external vector databases could now, in principle, be handled in a single prompt. This widened the set of architectures available to developers building AI-powered applications.
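The design trade-off can be illustrated with a small planning sketch: estimate a document's token count, then either send it whole or fall back to chunking. The ratio of 700 words per 1,000 tokens is an assumption taken from the figures quoted in this article (about 700,000 words per one million tokens); a real application would use the model's own tokenizer. The names `estimate_tokens` and `plan_prompts` and the `reserve` budget are hypothetical.

```python
WORDS_PER_1000_TOKENS = 700  # assumption: ~700,000 words per 1,000,000 tokens


def estimate_tokens(text):
    """Rough token estimate from a whitespace word count, rounded up."""
    words = len(text.split())
    return -(-words * 1000 // WORDS_PER_1000_TOKENS)


def plan_prompts(text, context_limit, reserve=1000):
    """Return [text] if it fits the window, else a list of word-based chunks.

    `reserve` holds back tokens for instructions and the model's reply.
    """
    budget = context_limit - reserve
    if estimate_tokens(text) <= budget:
        return [text]
    words = text.split()
    words_per_chunk = budget * WORDS_PER_1000_TOKENS // 1000
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


# A synthetic 210,000-word document, estimated at about 300,000 tokens.
document = " ".join(["word"] * 210_000)

for limit in (128_000, 1_000_000):
    prompts = plan_prompts(document, context_limit=limit)
    print(f"{limit:>9,}-token window -> {len(prompts)} prompt(s)")
```

Under the smaller window the document has to be split and the pieces reconciled afterwards, which is where retrieval or summarization cascades come in; under the larger one it fits in a single prompt.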

The long-term significance of Gemini 1.5 lies in establishing context length as a first-class competitive dimension in large language model development. The release accelerated an industry-wide race to extend context windows, with OpenAI, Anthropic, and others subsequently announcing expanded limits. More fundamentally, it raised the question of whether sufficiently long context might partially substitute for parametric memory — a question that continues to shape research into model architecture, training efficiency, and the nature of in-context learning.

The People

Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser

Sources

[1] Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. Machel Reid, Nikolay Savinov, Denis Teplyashin, et al. · 2024. https://arxiv.org/abs/2403.05530

[2] Our next-generation model: Gemini 1.5. Google DeepMind · 2024. https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/

[3] Gemini 1.5 Pro now available in 180+ countries; updated Gemini 1.0 Pro coming to the Gemini API. Google DeepMind · 2024. https://developers.googleblog.com/en/gemini-15-pro-now-available-in-180-countries-updated-gemini-10-pro-coming-to-the-gemini-api/