Ai Agents 3 min read

Claude 4.7 UI Guidelines Require Strict Screenshot Downscaling

Anthropic's new best practices for computer use identify click accuracy bottlenecks, providing precise screenshot limits and token configurations for Opus 4.7.

On May 13, 2026, Anthropic published technical guidelines for computer and browser use with Claude, detailing system optimizations for the Claude 4.6 family and the newly released Claude Opus 4.7. The documentation identifies click accuracy as the primary bottleneck for agentic UI automation and prescribes specific scaling constraints and prompt configurations to improve execution reliability.

Preventing Coordinate Offsets

Computer use models rely on precise coordinate mapping to interact with user interfaces. When an uploaded screenshot exceeds the API’s internal processing dimensions, the image is automatically downsampled. This internal resizing causes coordinate offsets, meaning the exact pixel location the model identifies no longer matches the actual user interface on the source machine.

To eliminate these click offsets, developers must pre-downscale screenshots to match the model’s exact native resolution before sending the API request.

Model FamilyMax Long EdgeMax ResolutionTarget Resolution
Claude 4.6 (Opus, Sonnet, Haiku 4.5)1568 pixels1.15 megapixels1280x720 (720p)
Claude Opus 4.72576 pixels3.75 megapixels1920x1080 (1080p)

Adaptive Thinking and Token Efficiency

Anthropic introduced the thinking parameter to control reasoning depth across five tiers: low, medium, high, xhigh (exclusive to Opus 4.7), and max. In complex UI automation, deeper reasoning allows the model to map visual layouts more accurately.

On the OSWorld Verified benchmark, running Opus 4.7 on the low effort setting matches the performance of Sonnet 4.6 on max, while consuming approximately 10% of the token volume. For production deployments navigating difficult interfaces, Anthropic recommends setting Opus 4.7 to high effort. This configuration yields optimal task success rates at roughly half the computational cost of the max setting.

Message construction order also affects scanning accuracy. Placing text instructions first, followed by the screenshot, preconditions the model to focus on the necessary UI targets during visual processing.

Capacity and Orchestration Upgrades

The computer use guidelines follow a massive capacity expansion for Anthropic. A May 6 partnership with SpaceX utilizes the Colossus 1 data center, backed by 300 megawatts of power and 220,000 NVIDIA GPUs. This compute surge eliminated peak-hour restrictions for premium users and doubled the rate limits for developers testing local workflows.

Simultaneously, Anthropic expanded its agent capabilities. The platform now supports multi-agent orchestration, where a single lead agent can manage up to 25 specialized sub-agents. The company also introduced “Outcomes” rubrics for validating success and a research preview of “Dreaming” for background memory consolidation in managed deployments.

Browser Prompt Injection Defenses

Browser automation introduces distinct security risks. Websites containing hidden malicious text can trigger prompt injections, tricking the agent into extracting or executing unauthorized commands. Robustness testing on models from Opus 4.5 onward demonstrates that internal security layers, which scan model activations during processing, reduce successful attack rates to approximately 1%.

If you are building visual automation loops, your immediate priority is updating your image ingestion pipeline. Downscaling screenshots to the 1568-pixel or 2576-pixel long-edge boundaries before the API call will drastically reduce click inaccuracies across your deployment.

Get Insanely Good at AI

Get Insanely Good at AI

The book for developers who want to understand how AI actually works. LLMs, prompt engineering, RAG, AI agents, and production systems.

Keep Reading