Microsoft Deploys Agent Mode and Claude 3.5 Sonnet to Office
Microsoft is rolling out Agent Mode across Word, Excel, and PowerPoint, adding Claude 3.5 Sonnet and in-canvas task execution to Microsoft 365.
Microsoft has begun the broad rollout of Agent Mode across Word, Excel, and PowerPoint, turning Copilot from a sidebar chatbot into an active, in-canvas collaborator. As detailed in the April 23 launch announcement, the system replaces single-turn text generation with a multi-step execution loop. The update also exposes model selection to enterprise tenants, so they can route workloads to either OpenAI models or Anthropic’s Claude 3.5 Sonnet.
Execution in the Office Canvas
The new architecture relies on an agentic design where the AI plans, acts, validates, and iterates directly inside the document. Microsoft refers to this internally as “vibe working,” a concept adapted from vibe coding where users declare a desired end state rather than writing granular instructions.
During task execution, the system maintains a human-agent dialogue. It pauses to ask clarifying questions about tone or formatting and provides a real-time execution trace of its actions. This is supported by web-grounded search, allowing the agent to pull live data like market trends directly into documents with citations.
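The plan, act, validate, iterate cycle and the execution trace described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Microsoft's implementation: the names `run_agent`, `AgentRun`, and the `toy_*` functions are hypothetical, and the toy functions stand in for real model calls and document edits.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentRun:
    goal: str
    trace: List[str] = field(default_factory=list)  # human-readable log of each step
    result: str = ""


def run_agent(
    goal: str,
    plan: Callable[[str, str], List[str]],   # (goal, feedback) -> list of steps
    act: Callable[[str, str], str],          # (draft, step) -> updated draft
    validate: Callable[[str, str], str],     # (goal, draft) -> "" if ok, else feedback
    max_iterations: int = 3,
) -> AgentRun:
    """Hypothetical plan/act/validate/iterate loop that records a trace."""
    run = AgentRun(goal=goal)
    feedback = ""
    for attempt in range(1, max_iterations + 1):
        steps = plan(goal, feedback)
        run.trace.append(f"attempt {attempt}: planned {len(steps)} step(s)")
        draft = run.result
        for step in steps:
            draft = act(draft, step)
            run.trace.append(f"attempt {attempt}: did '{step}'")
        run.result = draft
        feedback = validate(goal, draft)
        if not feedback:
            run.trace.append(f"attempt {attempt}: validation passed")
            break
        run.trace.append(f"attempt {attempt}: validation failed: {feedback}")
    return run


# Toy stand-ins (assumptions for illustration only).
def toy_plan(goal: str, feedback: str) -> List[str]:
    # First pass writes a title; later passes respond to validator feedback.
    return ["add summary"] if feedback else ["add title"]


def toy_act(draft: str, step: str) -> str:
    return draft + ("# Report\n" if step == "add title" else "Summary: ...\n")


def toy_validate(goal: str, draft: str) -> str:
    return "" if "Summary" in draft else "missing summary"


run = run_agent("quarterly report", toy_plan, toy_act, toy_validate)
print("\n".join(run.trace))
```

The `max_iterations` cap is a design choice in this sketch: it keeps a validator that never passes from looping forever, and the trace still shows the user where the run stopped.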
In Word, the agent can draft complete reports from source materials, execute formatting changes across hundreds of pages, and apply specific styles using a slash command that references external files or emails. The PowerPoint integration, currently rolling out through the Frontier program, generates decks of 7 to 10 slides, complete with speaker notes and automated layouts, from web research and brief prompts.
Benchmarks and the Accuracy Gap
Excel receives some of the most demanding capabilities, using a custom multi-agent approach to handle native spreadsheet functions. The agent can construct PivotTables, author formulas, and generate charts.
Spreadsheet performance shows the current limits of enterprise agents. On SpreadsheetBench, the new Excel agent achieved 57.2% accuracy. That is a marked improvement over earlier single-turn generation, but it still trails the 71% average achieved by human experts by roughly 14 percentage points.
This accuracy gap makes human-in-the-loop validation necessary for financial and operational data. The addition of Claude 3.5 Sonnet gives administrators one way to probe the gap: they can test whether a different reasoning model yields better accuracy on specific analytical tasks. If you are evaluating AI output for your own enterprise tools, running the same tasks against both OpenAI and Anthropic models provides a useful baseline for comparing complex reasoning.
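A side-by-side comparison like this can be sketched as a small evaluation harness. The example below is a minimal sketch under stated assumptions: `model_a` and `model_b` are hypothetical stand-ins for calls to two different models, the four tasks are invented, and exact-match scoring is a simplification that does not reproduce how SpreadsheetBench grades answers. The 0.71 human baseline is the figure quoted above.

```python
from typing import Callable, Dict, List, Tuple

Task = Tuple[str, str]  # (prompt, expected answer)


def accuracy(model: Callable[[str], str], tasks: List[Task]) -> float:
    """Fraction of tasks where the model's answer exactly matches the expected one."""
    if not tasks:
        return 0.0
    correct = sum(1 for prompt, expected in tasks if model(prompt).strip() == expected)
    return correct / len(tasks)


def compare(
    models: Dict[str, Callable[[str], str]],
    tasks: List[Task],
    human_baseline: float,
) -> Dict[str, Dict[str, float]]:
    """Score each model on the same tasks and report its gap to the human baseline."""
    report = {}
    for name, model in models.items():
        score = accuracy(model, tasks)
        report[name] = {"accuracy": score, "gap_to_human": human_baseline - score}
    return report


# Invented tasks and canned answers (assumptions for illustration only).
tasks: List[Task] = [("2+2", "4"), ("3*3", "9"), ("10/2", "5"), ("7-1", "6")]
answers_a = {"2+2": "4", "3*3": "9", "10/2": "5", "7-1": "5"}
answers_b = {"2+2": "4", "3*3": "6", "10/2": "5", "7-1": "8"}

models = {
    "model_a": lambda prompt: answers_a.get(prompt, ""),
    "model_b": lambda prompt: answers_b.get(prompt, ""),
}

report = compare(models, tasks, human_baseline=0.71)
for name, row in report.items():
    print(f"{name}: accuracy={row['accuracy']:.2f} gap_to_human={row['gap_to_human']:+.2f}")
```

Keeping the task set fixed across models is the point of the harness: any difference in the scores then comes from the model rather than from the prompts.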
Availability and Licensing
The April 23 deployment is part of Microsoft’s “2026 release wave 1.” The web versions of Word and Excel, along with the Windows desktop version of Excel, are the first to reach general availability. Mac desktop versions and PowerPoint desktop are scheduled for later in the release cycle.
Microsoft is splitting access based on licensing tiers. The full multi-step capabilities require a Microsoft 365 Copilot (Premium) license. Users without this license are placed on a basic tier, which restricts functionality to limited prompting without the autonomous execution loop.
If you build applications that interface with the Microsoft Graph API, this update changes the assumptions around document automation. Users will increasingly expect native tools to handle bulk formatting and data synthesis autonomously, making your external integrations most valuable when they provide high-quality, structured context for these in-canvas agents to reference.