128B Mistral Medium 3.5 Moves Vibe Coding Agents to the Cloud
Mistral AI's new 128-billion parameter dense model introduces configurable reasoning alongside asynchronous cloud-based execution for coding agents.
Mistral AI has launched Mistral Medium 3.5, a 128-billion parameter dense model designed as a unified flagship. The release replaces specialized predecessors like Mistral Medium 3.1 and Devstral 2. Alongside the model, Mistral introduced cloud-based Remote Agents for its Vibe platform and a dedicated Work Mode for the Le Chat interface.
Unified Architecture and Configurable Compute
Mistral Medium 3.5 operates on a dense Transformer architecture featuring a 256K token context window. The model includes a vision encoder trained from scratch to process variable image sizes and aspect ratios natively.
At the API level, developers can now control how much compute is applied to a prompt via a new `reasoning_effort` parameter. Setting it to `"none"` forces rapid, standard inference, while `"high"` allocates additional compute for complex agentic runs and multi-step planning.
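To make the trade-off concrete, here is a minimal sketch of building a request payload with the parameter. The payload shape and model identifier are assumptions for illustration; only the `reasoning_effort` parameter and its `"none"`/`"high"` values come from the announcement.

```python
# Hypothetical request builder; exact endpoint and payload schema are
# assumptions, not confirmed details of Mistral's API.

def build_request(prompt: str, effort: str = "none") -> dict:
    """Build a chat payload with a reasoning_effort setting."""
    if effort not in {"none", "high"}:
        raise ValueError(f"unsupported reasoning_effort: {effort!r}")
    return {
        "model": "mistral-medium-3.5",
        "messages": [{"role": "user", "content": prompt}],
        # "none" = fast standard inference, "high" = extra compute
        "reasoning_effort": effort,
    }

# Fast path for a trivial extraction task:
fast = build_request("Extract the invoice total from this text.")

# Deep path for a multi-step planning run:
deep = build_request("Plan a three-stage refactor of this repo.", effort="high")
```

Because both payloads target the same model, switching between the two paths is a one-field change rather than a model swap.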
| Specification / Benchmark | Mistral Medium 3.5 |
|---|---|
| Architecture | 128B Dense Transformer |
| Context Window | 256K Tokens |
| SWE-Bench Verified | 77.6% |
| τ³-Telecom (Agentic) | 91.4% |
The model is released under a Modified MIT License. For on-premises enterprise deployment, 4-bit quantized versions can run on hardware nodes equipped with 64GB to 80GB of RAM.
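The 64GB floor matches simple back-of-envelope arithmetic: 128 billion parameters at 4 bits each occupy roughly 64GB for the weights alone, and the remaining headroom covers KV cache and runtime overhead. A quick sketch:

```python
# Back-of-envelope memory estimate for quantized model weights.
# Ignores KV cache, activations, and runtime overhead, which is why
# real deployments budget 64-80GB rather than exactly 64GB.

def quantized_weight_gb(params_billions: float, bits_per_param: int) -> float:
    """Return the weight footprint in decimal gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

print(quantized_weight_gb(128, 4))   # 4-bit weights: 64.0 GB
print(quantized_weight_gb(128, 16))  # bf16 baseline: 256.0 GB
```

The bf16 baseline shows why quantization is the enabler here: the unquantized weights alone would exceed a single typical node.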
Session Teleporting in Vibe
Mistral’s approach to vibe coding previously centered on local command-line execution. The updated Vibe platform shifts this workload to the cloud via asynchronous Remote Agents.
Developers can now spawn multiple agents in parallel to handle distinct repositories or feature branches simultaneously. The primary mechanical addition is session teleporting. When a developer closes their local terminal, the active CLI session transfers directly to the cloud. This preserves the full execution state, conversation history, and approval context, allowing the agent to continue working unattended.
These remote instances integrate directly with standard developer infrastructure. Agents can autonomously open GitHub Pull Requests and push updates to enterprise tools like Jira, Linear, Slack, and Sentry.
Work Mode Orchestration
For non-developer workflows, Mistral added Work Mode to Le Chat in preview. This interface uses a dedicated agentic harness rather than standard chat orchestration, turning the model into an execution backend for long-horizon tasks.
Work Mode connects to daily operational tools to orchestrate inbox triage or meeting preparation. The system reads and writes across emails, calendars, and web search sessions. It surfaces draft actions for human-in-the-loop approval before final execution, relying on an underlying workflows orchestration engine to maintain state across the connected accounts.
Pricing and Availability
API usage for Mistral Medium 3.5 costs $1.50 per million input tokens and $7.50 per million output tokens. Access to Remote Agents and Work Mode is currently restricted to users on Pro, Team, and Enterprise plans.
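At those rates, per-run cost is easy to estimate. A minimal sketch (the token counts below are illustrative, not from the announcement):

```python
# Cost estimate at the published per-million-token rates.
INPUT_RATE = 1.50 / 1_000_000   # dollars per input token
OUTPUT_RATE = 7.50 / 1_000_000  # dollars per output token

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one API run."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: an agentic run consuming 200K input and 40K output tokens.
print(f"${run_cost(200_000, 40_000):.2f}")  # $0.60
```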
If you are building multi-agent systems, the explicit separation of reasoning effort at the API level lets you route trivial extraction tasks and complex planning loops to the same model. Update your routing logic to test the `reasoning_effort` parameter against your specific latency requirements before moving entirely away from specialized smaller models.
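A routing layer along those lines could start as simple as the sketch below. The keyword heuristic and task taxonomy are assumptions for illustration; only the `reasoning_effort` values come from the release.

```python
# Hypothetical router: send cheap extraction tasks and expensive
# planning loops to the same model with different reasoning_effort.

PLANNING_KEYWORDS = ("plan", "refactor", "design", "multi-step")

def choose_effort(task_description: str) -> str:
    """Pick a reasoning_effort value for a task (heuristic sketch)."""
    text = task_description.lower()
    if any(word in text for word in PLANNING_KEYWORDS):
        return "high"   # complex agentic / planning loop
    return "none"       # trivial extraction, fast path

print(choose_effort("Extract all dates from this document"))  # none
print(choose_effort("Plan a multi-step migration to v2"))     # high
```

In practice you would replace the keyword check with latency measurements or a classifier, but the shape stays the same: one model, two compute tiers.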