Inception: Mercury 2, released on March 4, 2026, is a groundbreaking reasoning diffusion LLM designed for speed and efficiency. With a context length of 128,000 tokens, it operates in a text-to-text modality, processing inputs and generating outputs seamlessly. Mercury 2 stands out with its ability to produce and refine multiple tokens in parallel, achieving over 1,000 tokens per second on standard GPUs. This model supports various parameters, including reasoning levels and structured outputs, making it ideal for coding workflows, real-time voice and search applications, and agent loops. Its compatibility with the OpenAI API further enhances its versatility in diverse environments.
Use Cases
Here are a few ways teams apply Inception: Mercury 2 in practice—from fast drafting to multimodal understanding. Adapt these ideas to your workflow.
Ideal for coding workflows with low latency
Enhances real-time voice and search applications
Streamlines agent loops for efficient processing
Facilitates complex reasoning tasks quickly
Key Features
A quick look at the capabilities that make this model useful in real projects.
Achieves >1,000 tokens/sec on GPUs
Supports 128K context length
Tunable reasoning levels available
Native tool use for enhanced functionality
Compatible with OpenAI API
Specs
Overview
Vendor
inception
Model ID
inception/mercury-2
Release
2026-03-04
Modalities & context
Input
text
Output
text
Context
128,000 tokens
Parameters & defaults
Supported parameters: include_reasoning, max_tokens, reasoning, response_format, stop, structured_outputs, temperature, tool_choice, tools
Defaults: temperature 0.75, top_p 0.95
Benchmark tests: Inception: Mercury 2
We ran this model against a few representative prompts to show its range. Review the outputs below and be the judge.
Text
Prompt:
Write 150 words on how AI might positively upend work, leisure and creativity
Artificial intelligence can reshape work by automating repetitive tasks, freeing employees to focus on strategic thinking, empathy, and complex problem‑solving. Intelligent assistants can draft reports, schedule meetings, and analyze data in real time, reducing burnout and accelerating decision‑making. In the gig economy, AI‑driven platforms match skills to projects with unprecedented precision, expanding access to flexible income. Leisure benefits as personalized recommendation engines curate movies, music, and travel experiences that align with mood and curiosity, while immersive AI‑generated worlds enable interactive storytelling and virtual tourism without geographic limits. Creative fields flourish because generative models act as collaborative partners, suggesting melodies, visual motifs, or plot twists that inspire human creators to explore uncharted styles. By handling routine labor and providing instant feedback, AI democratizes expertise, allowing anyone to prototype inventions, compose art, or coach others. The resulting synergy transforms productivity, enriches personal fulfillment, and expands the cultural landscape for generations to come.
Run this prompt on Upend.AI