Microsoft Deploys Agent Mode and Claude 3.5 Sonnet to Office

Microsoft has begun the broad rollout of Agent Mode across Word, Excel, and PowerPoint, turning Copilot from a sidebar chatbot into an active, in-canvas collaborator. As detailed at the April 23 launch, the system replaces single-turn text generation with a multi-step execution loop. The update also introduces a significant infrastructure shift: model selection is now exposed to enterprise tenants, who can route workloads to either OpenAI models or Anthropic’s Claude 3.5 Sonnet.

Execution in the Office Canvas

The new architecture relies on an agentic design where the AI plans, acts, validates, and iterates directly inside the document. Microsoft refers to this internally as “vibe working,” a concept adapted from vibe coding where users declare a desired end state rather than writing granular instructions.
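
Microsoft has not published the internals of this loop, so the following is only a minimal Python sketch of a generic plan, act, validate, iterate cycle. Every name in it (run_agent, toy_plan, toy_act, toy_validate, REQUIRED) is hypothetical and chosen for illustration; the "document" is a plain string and the goal is simply that two sections exist.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a desired end state: the sections the user wants.
REQUIRED = ["Summary", "Risks"]


def run_agent(
    goal: str,
    plan: Callable[[str, str], List[str]],
    act: Callable[[str, str], str],
    validate: Callable[[str, str], bool],
    max_iterations: int = 3,
) -> Tuple[str, List[str]]:
    """Generic plan -> act -> validate -> iterate loop with an execution trace."""
    document = ""
    trace: List[str] = []
    for attempt in range(1, max_iterations + 1):
        steps = plan(goal, document)          # decide what to do next
        trace.append(f"attempt {attempt}: planned {len(steps)} step(s)")
        for step in steps:
            document = act(document, step)    # apply the step to the document
            trace.append(f"attempt {attempt}: did {step!r}")
        if validate(goal, document):          # check against the declared end state
            trace.append(f"attempt {attempt}: validation passed")
            return document, trace
        trace.append(f"attempt {attempt}: validation failed, replanning")
    return document, trace


def toy_plan(goal: str, document: str) -> List[str]:
    # Deliberately plans one missing section at a time so the loop iterates.
    missing = [s for s in REQUIRED if s not in document]
    return missing[:1]


def toy_act(document: str, step: str) -> str:
    return document + f"{step}\n"


def toy_validate(goal: str, document: str) -> bool:
    return all(s in document for s in REQUIRED)


final_doc, trace = run_agent(
    "Report with Summary and Risks", toy_plan, toy_act, toy_validate
)
for line in trace:
    print(line)
```

The user supplies only the goal, and the loop keeps replanning until validation passes or the iteration budget runs out. The trace list plays the role of the execution trace described in the article.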

During task execution, the system maintains a human-agent dialogue. It pauses to ask clarifying questions about tone or formatting and provides a real-time execution trace of its actions. Web-grounded search supports this process, allowing the agent to pull live data such as market trends directly into documents with citations.

In Word, the agent can draft complete reports from source materials, execute formatting changes across hundreds of pages, and apply specific styles using a slash command that references external files or emails. The PowerPoint integration, currently rolling out through the Frontier program, generates decks of 7 to 10 slides, complete with speaker notes and automated layouts, from web research and brief prompts.

Benchmarks and the Accuracy Gap

Excel receives some of the most complex capabilities, using a custom multi-agent approach to handle native spreadsheet functions. The agent can build PivotTables, author formulas, and generate charts.

Performance in spreadsheet environments highlights the current limitations of enterprise agents. On SpreadsheetBench, the new Excel agent achieved a 57.2% accuracy rate. While this marks a significant improvement over previous single-turn generation, it still trails the 71% average achieved by human experts, a gap of 13.8 percentage points.

This accuracy gap necessitates human-in-the-loop validation for financial and operational data. The addition of Claude 3.5 Sonnet gives administrators a way to probe the gap, allowing them to test whether different reasoning models yield better accuracy for specific analytical tasks. If you are evaluating AI output for your own enterprise tools, the ability to run the same workload against both OpenAI and Anthropic models provides a useful baseline for testing complex reasoning.
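
One way to run such a comparison is a small harness that scores each model on the same labeled tasks. The Python sketch below assumes each model is wrapped as a plain function from prompt to answer. The names model_a and model_b, the task list, and the canned answers are placeholders invented for illustration and say nothing about how any real model performs; only the 57.2% and 71% figures come from the article.

```python
from typing import Callable, Dict, List, Tuple

Task = Tuple[str, str]  # (prompt, expected answer)


def accuracy(model: Callable[[str], str], tasks: List[Task]) -> float:
    """Fraction of tasks where the model's answer exactly matches the label."""
    correct = sum(1 for prompt, expected in tasks if model(prompt).strip() == expected)
    return correct / len(tasks)


def compare(models: Dict[str, Callable[[str], str]], tasks: List[Task]) -> Dict[str, float]:
    """Score every model on the same task set."""
    return {name: accuracy(fn, tasks) for name, fn in models.items()}


# Toy labeled tasks (placeholders, not SpreadsheetBench items).
tasks: List[Task] = [
    ("2+2", "4"),
    ("10/4", "2.5"),
    ("max(3,9)", "9"),
    ("len('abcd')", "4"),
]

# Canned answers standing in for real model calls (hypothetical).
answers_a = {"2+2": "4", "10/4": "2.5", "max(3,9)": "9", "len('abcd')": "5"}
answers_b = {"2+2": "4", "10/4": "2", "max(3,9)": "3", "len('abcd')": "4"}

models = {
    "model_a": lambda prompt: answers_a.get(prompt, ""),
    "model_b": lambda prompt: answers_b.get(prompt, ""),
}

results = compare(models, tasks)
print(results)

# Gap reported in the article: human experts (71%) vs. the Excel agent (57.2%).
reported_gap = round(71.0 - 57.2, 1)
print(f"reported gap: {reported_gap} percentage points")
```

Exact string matching is the simplest possible scorer. Real spreadsheet tasks would likely need tolerance for numeric formatting and a human reviewer for anything the scorer cannot judge.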

Availability and Licensing

The April 23 deployment is part of Microsoft’s “2026 release wave 1.” The web versions of Word and Excel, along with the Windows desktop version of Excel, are the first to reach general availability. Mac desktop versions and PowerPoint desktop are scheduled for later in the release cycle.

Microsoft is splitting access based on licensing tiers. The full multi-step capabilities require a Microsoft 365 Copilot (Premium) license. Users without this license are placed on a basic tier, which restricts functionality to limited prompting without the autonomous execution loop.

If you build applications that interface with the Microsoft Graph API, this update changes the assumptions around document automation. Users will increasingly expect native tools to handle bulk formatting and data synthesis autonomously, making your external integrations most valuable when they provide high-quality, structured context for these in-canvas agents to reference.

