What just happened? Anthropic unveiled a major upgrade to its AI toolset this week with Claude Sonnet 4.5, a new model designed to handle longer and more complex programming tasks while offering significant improvements in instruction-following and real-world business applications. The rollout underscores Anthropic's push to maintain its presence in the increasingly crowded market for AI-powered developer tools, where it faces rising competition from industry leaders such as OpenAI and Google.

According to Anthropic, Claude Sonnet 4.5 can sustain autonomous coding sessions for up to 30 hours – a substantial increase over the company's previous Claude Opus 4 model, which managed roughly seven hours without interruption. The firm claims that Sonnet 4.5 is stronger "in almost every way" than earlier versions, with improvements not only in task persistence and execution speed, but also in the range of tasks it can perform on a user's system.

Key upgrades include checkpoints in the Claude Code tool, which let developers snapshot their coding progress and revert to an earlier state; a refreshed terminal interface; and an official Visual Studio Code extension that brings Claude Code into the editor. The updated Claude API introduces context editing and enhanced memory management, which Anthropic says allow the model to handle longer, more complex requests without losing track of earlier context.
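To make the snapshot-and-revert idea concrete, here is a minimal Python sketch of how a checkpoint mechanism can work in general. It is an illustration only, not how Claude Code implements checkpoints: the `CheckpointStore` class, its methods, and the file name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointStore:
    """Toy in-memory model of snapshot-and-revert (hypothetical, for illustration)."""

    files: dict[str, str] = field(default_factory=dict)
    _snapshots: list[dict[str, str]] = field(default_factory=list)

    def checkpoint(self) -> int:
        """Save a copy of the current file contents and return its index."""
        self._snapshots.append(dict(self.files))
        return len(self._snapshots) - 1

    def revert(self, index: int) -> None:
        """Restore the files saved at `index` and drop any later snapshots."""
        self.files = dict(self._snapshots[index])
        del self._snapshots[index + 1:]


store = CheckpointStore()
store.files["app.py"] = "print('v1')"
saved = store.checkpoint()                    # snapshot the working state

store.files["app.py"] = "print('v2, broken')"  # a change that goes wrong
store.revert(saved)                           # roll back to the snapshot
print(store.files["app.py"])
```

The sketch copies the whole file map on every checkpoint, which keeps the logic obvious; a real tool would more plausibly store diffs or rely on version control.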

For end users, Claude can now execute code and create files, such as spreadsheets, presentations, and documents, directly within a conversation.

Anthropic is positioning Claude Sonnet 4.5 as a "frontier model," highlighting its performance on industry benchmarks. On SWE-bench Verified, which evaluates models' ability to solve real-world coding problems, Anthropic says Sonnet 4.5 posts the leading score. On OSWorld, a benchmark that assesses how well AI systems carry out practical computer tasks, the model scores 61.4 percent, up from 42.2 percent for Sonnet 4.
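For a sense of scale, the OSWorld jump can be expressed both in percentage points and as a relative improvement. The short Python sketch below does the arithmetic using only the two scores quoted above; the function name `gains` is just an illustrative choice.

```python
def gains(new_score: float, old_score: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    point_gain = new_score - old_score
    relative_gain = point_gain / old_score * 100
    return point_gain, relative_gain


# OSWorld scores as reported in the article
points, relative = gains(61.4, 42.2)
print(f"{points:.1f} percentage points")   # 19.2 percentage points
print(f"{relative:.1f}% relative gain")    # 45.5% relative gain
```

In other words, the new model completes roughly 45 percent more OSWorld tasks than Sonnet 4 did, assuming both scores come from the same benchmark setup.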

This version of Claude also incorporates enhanced defenses against common vulnerabilities in AI agent deployments, such as prompt injection attacks. In addition, Anthropic reports improvements in alignment, a measure of how consistently an AI system behaves as intended.

Anthropic's executive summaries and public-facing system cards now report results from safety and alignment evaluations, including ones based on mechanistic interpretability. The company says these results show reductions in undesirable behaviors such as sycophancy, deception, and power-seeking.

Developers can now access the Claude Agent SDK, the same infrastructure used by Anthropic's own teams to build and scale agentic tools. The SDK provides resources for memory management, user permission systems, and coordination among sub-agents.

Claude Sonnet 4.5 is available immediately via the Claude API at the same pricing as its predecessor. Anthropic executives said further improvements are on the way, with additional Opus model updates expected later this year.