Discover Baidu's ERNIE 4.5 VL 28B A3B, released on August 12, 2025. This multimodal model excels in processing both text and images, outputting text with a context window of up to 30,000 tokens. Utilizing a Mixture-of-Experts architecture, it activates 3 billion parameters per token, enhancing its ability to understand and generate language and vision tasks. With a focus on efficient training and inference, it incorporates techniques like SFT, DPO, and UPO to optimize performance. The model supports cross-modal reasoning and generation, making it adaptable for diverse applications.
Use Cases
Here are a few ways teams apply Baidu: ERNIE 4.5 VL 28B A3B in practice—from fast drafting to multimodal understanding. Adapt these ideas to your workflow.
Enhance text and image content understanding
Optimize AI-driven visual and textual analysis
Improve cross-modal content generation
Facilitate large-scale data processing
Support complex reasoning tasks
Key Features
A quick look at the capabilities that make this model useful in real projects.
Multimodal Mixture-of-Experts architecture
28B total parameters, 3B activated per token
Supports 30,000 token context length
Text and image input, text output
Uses SFT, DPO, UPO for performance
Cross-modal reasoning and generation
Specs
Overview
Vendor
baidu
Model ID
baidu/ernie-4.5-vl-28b-a3b
Release
2025-08-12
Modalities & context
Input
text · image
Output
text
Context
30,000 tokens
Parameters & defaults
Supported parameters: frequency_penalty, include_reasoning, max_tokens, presence_penalty, reasoning, repetition_penalty, seed, stop, temperature, tool_choice, tools, top_k, top_p
Defaults: temperature 0.2, top_p 0.95
Benchmark tests: Baidu: ERNIE 4.5 VL 28B A3B
We ran this model against a few representative prompts to show its range. Review the outputs below and be the judge.
Text
Prompt:
Write 150 words on how AI might positively upend work, leisure and creativity
Artificial intelligence holds transformative potential to revolutionize work, leisure, and creativity. In the workplace, AI can streamline repetitive tasks, freeing humans to focus on complex problem-solving and innovation. By automating routine operations, it boosts productivity and reduces burnout, while predictive analytics and AI-driven insights empower smarter decision-making. For leisure, AI personalizes experiences—curating entertainment, travel, and hobbies to match individual preferences—making downtime more engaging and enriching. It can also democratize access to education and cultural resources, fostering lifelong learning.
Creatively, AI acts as a collaborative tool, augmenting human imagination rather than replacing it. Artists, writers, and musicians leverage AI to explore new styles, generate ideas, or refine work, pushing boundaries of expression. It democratizes creativity by lowering barriers to entry, enabling diverse voices to thrive. Ultimately, AI could redefine work as more fulfilling, leisure as immersive, and creativity as boundless—ushering in an era where technology amplifies human potential rather than diminishes it.
Run this prompt on Upend.AI