128B Mistral Medium 3.5 Moves Vibe Coding Agents to the Cloud
Mistral AI's new 128-billion parameter dense model introduces configurable reasoning alongside asynchronous cloud-based execution for coding agents.
Mistral AI has launched Mistral Medium 3.5, a 128-billion parameter dense model designed as a unified flagship. The release replaces specialized predecessors like Mistral Medium 3.1 and Devstral 2. Alongside the model, Mistral introduced cloud-based Remote Agents for its Vibe platform and a dedicated Work Mode for the Le Chat interface.
Unified Architecture and Configurable Compute
Mistral Medium 3.5 operates on a dense Transformer architecture featuring a 256K token context window. The model includes a vision encoder trained from scratch to process variable image sizes and aspect ratios natively.
At the API level, developers can now control how much compute is applied to a prompt via a new `reasoning_effort` parameter. Setting it to `"none"` forces rapid, standard inference, while `"high"` allocates additional compute for complex agentic runs and multi-step planning.
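To make the trade-off concrete, here is a minimal sketch of building a request payload with the parameter. The payload shape and model identifier are assumptions for illustration; only the `reasoning_effort` parameter and its `"none"`/`"high"` values come from the announcement.

```python
# Hypothetical request builder; exact endpoint and payload schema are
# assumptions, not confirmed details of Mistral's API.

def build_request(prompt: str, effort: str = "none") -> dict:
    """Build a chat payload with a reasoning_effort setting."""
    if effort not in {"none", "high"}:
        raise ValueError(f"unsupported reasoning_effort: {effort!r}")
    return {
        "model": "mistral-medium-3.5",
        "messages": [{"role": "user", "content": prompt}],
        # "none" = fast standard inference, "high" = extra compute
        "reasoning_effort": effort,
    }

# Fast path for a trivial extraction task:
fast = build_request("Extract the invoice total from this text.")

# Deep path for a multi-step planning run:
deep = build_request("Plan a three-stage refactor of this repo.", effort="high")
```

Because both payloads target the same model, switching between the two paths is a one-field change rather than a model swap.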
| Specification / Benchmark | Mistral Medium 3.5 |
|---|---|
| Architecture | 128B Dense Transformer |
| Context Window | 256K Tokens |
| SWE-Bench Verified | 77.6% |
| τ³-Telecom (Agentic) | 91.4% |
The model is released under a Modified MIT License. For on-premises enterprise deployment, 4-bit quantized versions can run on hardware nodes equipped with 64GB to 80GB of RAM.
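The 64GB floor matches simple back-of-envelope arithmetic: 128 billion parameters at 4 bits each occupy roughly 64GB for the weights alone, and the remaining headroom covers KV cache and runtime overhead. A quick sketch:

```python
# Back-of-envelope memory estimate for quantized model weights.
# Ignores KV cache, activations, and runtime overhead, which is why
# real deployments budget 64-80GB rather than exactly 64GB.

def quantized_weight_gb(params_billions: float, bits_per_param: int) -> float:
    """Return the weight footprint in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

print(quantized_weight_gb(128, 4))   # 4-bit weights: 64.0 GB
print(quantized_weight_gb(128, 16))  # bf16 baseline: 256.0 GB
```

The bf16 baseline shows why quantization is the enabler here: the unquantized weights alone would exceed a single typical node.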
Session Teleporting in Vibe
Mistral’s approach to vibe coding previously centered on local command-line execution. The updated Vibe platform shifts this workload to the cloud via asynchronous Remote Agents.
Developers can now spawn multiple agents in parallel to handle distinct repositories or feature branches simultaneously. The primary mechanical addition is session teleporting. When a developer closes their local terminal, the active CLI session transfers directly to the cloud. This preserves the full execution state, conversation history, and approval context, allowing the agent to continue working unattended.
These remote instances integrate directly with standard developer infrastructure. Agents can autonomously open GitHub Pull Requests and push updates to enterprise tools like Jira, Linear, Slack, and Sentry.
Work Mode Orchestration
For non-developer workflows, Mistral added Work Mode to Le Chat in preview. This interface uses a dedicated agentic harness rather than standard chat orchestration, turning the model into an execution backend for long-horizon tasks.
Work Mode connects to daily operational tools to orchestrate inbox triage or meeting preparation. The system reads and writes across emails, calendars, and web search sessions. It surfaces draft actions for human-in-the-loop approval before final execution, relying on an underlying workflows orchestration engine to maintain state across the connected accounts.
Pricing and Availability
API usage for Mistral Medium 3.5 costs $1.50 per million input tokens and $7.50 per million output tokens. Access to Remote Agents and Work Mode is currently restricted to users on Pro, Team, and Enterprise plans.
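At those rates, per-run cost is easy to estimate. A minimal sketch (the token counts below are illustrative, not from the announcement):

```python
# Cost estimate at the published per-million-token rates.
INPUT_RATE = 1.50 / 1_000_000   # dollars per input token
OUTPUT_RATE = 7.50 / 1_000_000  # dollars per output token

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one API run."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: an agentic run consuming 200K input and 40K output tokens.
print(f"${run_cost(200_000, 40_000):.2f}")  # $0.60
```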
If you are building multi-agent systems, the explicit separation of reasoning effort at the API level lets you route trivial extraction tasks and complex planning loops to the same model. Update your routing logic to test the `reasoning_effort` parameter against your specific latency requirements before moving entirely away from specialized smaller models.
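A routing layer along those lines could start as simple as the sketch below. The keyword heuristic and task taxonomy are assumptions for illustration; only the `reasoning_effort` values come from the release.

```python
# Hypothetical router: send cheap extraction tasks and expensive
# planning loops to the same model with different reasoning_effort.

PLANNING_KEYWORDS = ("plan", "refactor", "design", "multi-step")

def choose_effort(task_description: str) -> str:
    """Pick a reasoning_effort value for a task (heuristic sketch)."""
    text = task_description.lower()
    if any(word in text for word in PLANNING_KEYWORDS):
        return "high"   # complex agentic / planning loop
    return "none"       # trivial extraction, fast path

print(choose_effort("Extract all dates from this document"))  # none
print(choose_effort("Plan a multi-step migration to v2"))     # high
```

In practice you would replace the keyword check with latency measurements or a classifier, but the shape stays the same: one model, two compute tiers.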