Claude Sonnet 5 Is Official: API ID, Pricing, Context Window, and What to Test First

Anthropic has now made Claude Sonnet 5 official. The launch page is live, the Sonnet product page points readers to the new release, and the Claude Platform docs list the API model ID as claude-sonnet-5.

That matters because this is not just a Claude.ai product update. Sonnet 5 is the model Anthropic is positioning for everyday professional work, coding, browser and computer use, and multi-step agents where Opus-level capability has often been attractive but too expensive for high-volume use.

The short version: Sonnet 5 narrows the gap with Opus 4.8, keeps Sonnet economics, and gives developers a new default candidate for production agent workflows. The longer version is more interesting, because the release also changes how teams should run evals, price migrations, and route work between Sonnet and Opus.

What Anthropic announced

Anthropic's official post, Introducing Claude Sonnet 5, says the model is designed to be its most agentic Sonnet model so far. The examples Anthropic emphasizes are planning, tool use, browsers, terminals, coding, and knowledge work.

The company also says Sonnet 5 is available across all Claude plans. It is the default model for Free and Pro users, available to Max, Team, and Enterprise users, and available in Claude Code and on the Claude Platform. For developers, the important line is simple: use claude-sonnet-5 through the Claude API.

The Claude Platform models overview has also been updated. It now lists Claude Sonnet 5 next to Claude Opus 4.8 and Claude Haiku 4.5 in the current model comparison, with a 1M token context window and 128k max output for the synchronous Messages API. The same docs note that Sonnet 5 supports up to 300k output tokens through the Message Batches API with the relevant beta header.

Anthropic's Claude Sonnet product page now describes Sonnet as a hybrid reasoning model for real-time agents and high-volume work, and the page's latest update card links directly to Sonnet 5.

The practical API facts

Here is the developer-facing checklist as of the launch:

Item	Current value
Model ID	`claude-sonnet-5`
Context window	1M tokens
Max output, synchronous Messages API	128k tokens
Introductory API price	$2 / million input tokens and $10 / million output tokens
Intro pricing window	Through August 31, 2026
Standard API price after intro window	$3 / million input tokens and $15 / million output tokens
Recommended comparison model	Claude Opus 4.8 for higher-accuracy agentic and cyber-sensitive work

The introductory pricing is important. At $2 per million input tokens and $10 per million output tokens through August 31, 2026, Sonnet 5 can be tested in real workloads without the same cost pressure as Opus-class models. After the introductory window, the platform docs show standard pricing at $3 / $15 per million tokens.

That means a migration test should not only compare quality. It should also compare tokenization, effort settings, retries, and the number of tool calls needed to finish a task. A model that needs fewer loops can be cheaper even if the nominal per-token price is higher than an older baseline.
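To make the price difference concrete, here is a minimal sketch that applies the two price points from the table to a token count. The type and function names are illustrative, and the calculation deliberately ignores prompt caching and batch pricing, which the article does not quantify.

```typescript
// Per-million-token prices in USD, taken from the table above.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

const INTRO_PRICING: Pricing = { inputPerMTok: 2, outputPerMTok: 10 };
const STANDARD_PRICING: Pricing = { inputPerMTok: 3, outputPerMTok: 15 };

// Cost of a request (or a whole workload) from raw token counts.
// Assumption: no caching or batch discounts are applied.
function requestCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing,
): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMTok +
    (outputTokens / 1_000_000) * pricing.outputPerMTok
  );
}

// Hypothetical workload: 1M input tokens and 1M output tokens in total.
const introCost = requestCostUsd(1_000_000, 1_000_000, INTRO_PRICING); // 12
const standardCost = requestCostUsd(1_000_000, 1_000_000, STANDARD_PRICING); // 18
```

Because the tokenizer and the number of agent turns can both change between models, the token counts fed into a function like this should come from logged production runs, not from estimates carried over from an older model.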

Why Sonnet 5 is bigger than another model refresh

Sonnet has been the practical Claude line for a while: strong enough for production, cheaper than Opus, fast enough for interactive workflows. The problem was that the hardest agent tasks still often pushed teams toward Opus. Long debugging sessions, browser workflows, messy repository changes, and multi-step business processes tend to punish models that lose track of the plan.

Sonnet 5 is Anthropic's attempt to move that execution layer down into the Sonnet tier. The launch post says Sonnet 5 is much stronger than Sonnet 4.6 on reasoning, tool use, coding, and knowledge work. It also says performance is close to Opus 4.8 at lower prices.

Read that carefully. Anthropic is not saying Sonnet 5 replaces Opus 4.8 everywhere. In the same post, Opus 4.8 remains the higher-accuracy choice on demanding agentic search and computer-use tasks. The more useful interpretation is this: Sonnet 5 becomes the first model many teams should try before escalating to Opus.

For production systems, that suggests a three-layer routing pattern:

Start most user-facing and coding-agent work on Sonnet 5.
Escalate to Opus 4.8 when the task needs maximum accuracy, high-stakes reasoning, or lower guardrails for approved cybersecurity work.
Keep Haiku-class models for high-volume extraction, classification, and simple routing.

This is especially relevant for teams that already built model-routing infrastructure. If your application can switch models by config, Sonnet 5 is not a rewrite. It is a new default lane to benchmark.
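The three-layer pattern above could be expressed as a small routing function. This is a sketch under stated assumptions: the TaskProfile fields are hypothetical, and only the Sonnet 5 model ID comes from the launch materials. The Haiku and Opus entries are placeholders to be filled in from your own configuration.

```typescript
type Tier = "haiku" | "sonnet" | "opus";

// Hypothetical task descriptor; field names are illustrative only.
interface TaskProfile {
  kind: "extraction" | "classification" | "routing" | "coding" | "agent" | "chat";
  highStakes: boolean; // needs maximum accuracy or high-stakes reasoning
  approvedSecurityWork: boolean; // approved cybersecurity work
}

// Only "claude-sonnet-5" is taken from the article. The other two values
// are placeholders, not real model IDs.
const MODEL_BY_TIER: Record<Tier, string> = {
  haiku: "<your-haiku-model-id>",
  sonnet: "claude-sonnet-5",
  opus: "<your-opus-model-id>",
};

function chooseTier(task: TaskProfile): Tier {
  // Escalate first: accuracy-critical and approved security work go to Opus.
  if (task.highStakes || task.approvedSecurityWork) return "opus";
  // High-volume, simple work stays on the Haiku tier.
  if (
    task.kind === "extraction" ||
    task.kind === "classification" ||
    task.kind === "routing"
  ) {
    return "haiku";
  }
  // Everything else starts on Sonnet 5.
  return "sonnet";
}

function chooseModel(task: TaskProfile): string {
  return MODEL_BY_TIER[chooseTier(task)];
}
```

Keeping the tier decision separate from the tier-to-model map means a later model swap touches one table and leaves the routing rules alone.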

What to test first

Do not migrate by running a few prompts in the console and calling it done. Sonnet 5 is being marketed around agents, and agent quality only shows up under stateful pressure. Start with real traces.

1. Brownfield coding tasks

Use issues from your own repository, not toy coding prompts. Good test cases include flaky tests, small refactors that touch several files, dependency upgrade fixes, and bugs where the first stack trace is misleading.

Score the model on whether it finds the real cause, writes or updates tests, keeps the patch small, and explains what changed without inventing extra work. Anthropic's launch materials highlight exactly this kind of sustained coding and debugging behavior.

2. Tool-use loops

Run Sonnet 5 through the tools your agents actually use: shell commands, browser automation, database queries, internal search, ticket systems, and file editing. The question is not whether the model can call a tool. The question is whether it recovers when a tool returns an unexpected result.

Watch for two failure modes: premature success and tool thrashing. A good agent should verify before finishing, but it should not keep calling tools after the answer is already known.
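Both failure modes can be flagged automatically from logged traces. The sketch below uses a hypothetical trace format and two rough heuristics that are assumptions of this example, not definitions from Anthropic: a final answer directly after a state-changing call counts as premature success, and an identical call repeated more than a set number of times counts as thrashing.

```typescript
// Hypothetical trace format; adapt it to whatever your agent framework logs.
type TraceStep =
  | { kind: "tool_call"; tool: string; args: string; mutates: boolean }
  | { kind: "final_answer" };

interface TraceFlags {
  prematureSuccess: boolean;
  toolThrashing: boolean;
}

function flagTrace(steps: TraceStep[], maxRepeats = 3): TraceFlags {
  const callCounts = new Map<string, number>();
  let toolThrashing = false;
  let prematureSuccess = false;
  // True while the most recent tool call changed state and nothing
  // read-only has run since (treated here as "not yet verified").
  let unverifiedMutation = false;

  for (const step of steps) {
    if (step.kind === "tool_call") {
      const key = JSON.stringify([step.tool, step.args]);
      const seen = (callCounts.get(key) ?? 0) + 1;
      callCounts.set(key, seen);
      if (seen > maxRepeats) toolThrashing = true;
      unverifiedMutation = step.mutates;
    } else if (unverifiedMutation) {
      // The agent declared success right after changing something.
      prematureSuccess = true;
    }
  }
  return { prematureSuccess, toolThrashing };
}
```

These flags are a triage aid. A flagged trace still needs a human or a task-specific check to confirm that the run actually failed.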

3. Browser and computer use

Anthropic calls out browser and computer-use work repeatedly in the Sonnet 5 launch. If you have workflows like competitive research, procurement, onboarding, form filling, or admin-console operations, build a small replay suite.

The best metric is task completion with clean state. Did the model finish the workflow, leave the system in the expected state, and avoid unintended side effects?
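One way to score "completion with clean state" in a replay suite is to diff state snapshots taken before and after the run. The sketch assumes the system under test can be summarized as flat key/value pairs, which is a simplification; the Snapshot type and checkReplay function are hypothetical names.

```typescript
// Flat key/value view of the system under test, e.g. record fields.
type Snapshot = Record<string, string>;

interface ReplayResult {
  completed: boolean; // every expected value is present after the run
  sideEffects: string[]; // keys that changed but were not expected to
}

function checkReplay(
  before: Snapshot,
  after: Snapshot,
  expected: Snapshot,
): ReplayResult {
  const completed = Object.keys(expected).every(
    (key) => after[key] === expected[key],
  );

  // Any key outside the expected set whose value differs is a side effect,
  // including keys that were added or removed.
  const allKeys = Object.keys({ ...before, ...after });
  const sideEffects = allKeys
    .filter((key) => !(key in expected) && before[key] !== after[key])
    .sort();

  return { completed, sideEffects };
}
```

A run passes only when completed is true and sideEffects is empty, which keeps "finished the workflow" and "touched nothing else" as separate, inspectable signals.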

4. Long-context retrieval

The 1M context window is useful, but it is not magic. Test whether Sonnet 5 can find the right facts in a large policy corpus, customer archive, codebase, or meeting-history dump. Measure both recall and citation quality.

Large context can hide sloppy retrieval. If the answer is right but cannot point to the relevant section, your user still has to do the verification work.
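A simple way to track both numbers is to label each test question with the sections that contain the answer and compare them with the sections the model cites. The sketch below measures recall and citation quality only through section IDs, which is one possible operationalization and an assumption of this example. Extracting cited IDs from a model response is left to your own parsing.

```typescript
interface RetrievalCase {
  expectedSections: string[]; // section IDs that contain the needed facts
  citedSections: string[]; // section IDs the model actually cited
}

interface RetrievalScore {
  recall: number; // share of expected sections that were cited
  citationPrecision: number; // share of citations that hit an expected section
}

function scoreRetrieval(testCase: RetrievalCase): RetrievalScore {
  const expected = new Set(testCase.expectedSections);
  const cited = new Set(testCase.citedSections);

  let hits = 0;
  cited.forEach((id) => {
    if (expected.has(id)) hits += 1;
  });

  return {
    // With nothing to find, recall is trivially perfect.
    recall: expected.size === 0 ? 1 : hits / expected.size,
    // With no citations at all, nothing can be verified.
    citationPrecision: cited.size === 0 ? 0 : hits / cited.size,
  };
}
```

An answer that is correct but scores zero on citationPrecision is the case described above: the user still has to do the verification work.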

5. Cost per finished task

Do not compare only price per million tokens. For agentic workloads, compare cost per completed task. Include retries, tool calls, cache reads, failed runs, and human review time.

A Sonnet 5 call may cost more than the same call on an older Sonnet model once the introductory window ends, but if the task finishes in fewer turns, total cost can still fall.
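Cost per completed task can be computed from run logs with a few lines of code. In this sketch the RunRecord fields and the hourly review rate are hypothetical inputs. The important design choice is that failed runs add to the numerator but not to the denominator.

```typescript
interface RunRecord {
  succeeded: boolean;
  modelCostUsd: number; // all turns, retries, and cache reads for this run
  toolCostUsd: number; // metered tool calls, if any
  reviewMinutes: number; // human review time spent on this run
}

function costPerCompletedTask(
  runs: RunRecord[],
  reviewRateUsdPerHour: number,
): number {
  let totalUsd = 0;
  let completed = 0;

  for (const run of runs) {
    // Failed runs still cost money, so they are always added to the total.
    totalUsd +=
      run.modelCostUsd +
      run.toolCostUsd +
      (run.reviewMinutes / 60) * reviewRateUsdPerHour;
    if (run.succeeded) completed += 1;
  }

  // No completed tasks means the cost per completed task is unbounded.
  return completed === 0 ? Infinity : totalUsd / completed;
}
```

Running the same function over logs from two models on the same task set gives a like-for-like comparison that per-token prices alone cannot provide.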

Migration notes for Claude API users

For direct Claude API users, the minimal change is the model name:

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const message = await anthropic.messages.create({
  model: "claude-sonnet-5",
  max_tokens: 2048,
  messages: [
    {
      role: "user",
      content: "Review this migration plan and list the highest-risk assumptions.",
    },
  ],
});

console.log(message.content);

That is the easy part. The careful part is migration control:

Put claude-sonnet-5 behind a model flag, not a hard-coded constant spread across the codebase.
Keep Sonnet 4.6 and Opus 4.8 available during the first rollout window.
Log model, effort setting, prompt version, token counts, tool calls, latency, and final task outcome.
Run side-by-side evals before changing production defaults.
Watch for tokenizer differences. Anthropic notes that Sonnet 5 uses an updated tokenizer, so the same text can map to a different token count than older Sonnet models.

If you use a gateway or an OpenAI-compatible abstraction, the same principle applies: centralize model selection and treat Sonnet 5 as a staged rollout, not a find-and-replace on the model string.
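A minimal sketch of the first and third checklist items follows: one place that resolves the model from configuration, and one record shape for per-run logging. The environment variable name CLAUDE_MODEL and the RunLog field names are assumptions made for illustration, not names from Anthropic's SDK.

```typescript
// Fields mirror the logging checklist above; the names are illustrative.
interface RunLog {
  model: string;
  effort: string;
  promptVersion: string;
  inputTokens: number;
  outputTokens: number;
  toolCalls: number;
  latencyMs: number;
  outcome: "success" | "failure" | "needs_review";
}

const DEFAULT_MODEL = "claude-sonnet-5";

// Resolve the model from an environment-like map so the choice lives in
// configuration. CLAUDE_MODEL is a hypothetical variable name.
function resolveModel(env: Record<string, string | undefined>): string {
  const override = env["CLAUDE_MODEL"]?.trim();
  return override ? override : DEFAULT_MODEL;
}

// Build a log record for one finished run, stamping it with the model
// that was actually used.
function buildRunLog(
  env: Record<string, string | undefined>,
  details: Omit<RunLog, "model">,
): RunLog {
  return { model: resolveModel(env), ...details };
}
```

In a Node.js service, calling resolveModel(process.env) once at startup and passing the result down keeps rollbacks to an older model a configuration change and not a code change.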

Safety and cybersecurity caveats

The Claude Sonnet 5 System Card is worth reading before using the model in high-risk domains. Anthropic says Sonnet 5 improved over Sonnet 4.6 on several agentic safety measures, including resistance to malicious requests and prompt-injection hijacking. It also reports lower hallucination and sycophancy rates than Sonnet 4.6.

At the same time, stronger general capability can raise the ceiling on sensitive workflows. Anthropic says it did not deliberately train Sonnet 5 for cybersecurity tasks, but the model is stronger than Sonnet 4.6 on some security-relevant evaluations. It launched with cyber safeguards enabled by default.

For normal developer workflows, that is a good tradeoff. For approved security research, red-team automation, or vulnerability analysis where guardrail behavior affects the task, compare Sonnet 5 with Opus 4.8 and read the system card before choosing a default.

Should you switch now?

For most teams building with Claude, yes: Sonnet 5 deserves an immediate evaluation. It is not automatically the right production default, but it is the new baseline to test for agentic coding, browser workflows, research assistants, and internal operations agents.

The most likely early winners are:

coding agents that need to finish multi-file work without losing the thread;
browser agents that need better planning and recovery;
research workflows that combine long context with tool use;
customer-facing assistants where Opus is too expensive but older Sonnet models were not reliable enough;
enterprise automations where the model has to take action across multiple systems.

The most likely holdouts are tasks where Opus 4.8's extra accuracy is worth the cost, or domains where the safety profile matters more than raw throughput.

Bottom line

Claude Sonnet 5 is the release that makes the Sonnet tier feel less like the cheaper compromise and more like the default execution layer for real agents. The model ID is live, the API docs are updated, the context window is 1M tokens, and the launch pricing gives teams a short window to run serious production evals before standard pricing begins.

Do not just ask whether Sonnet 5 is smarter than Sonnet 4.6. Ask whether it completes more work per dollar, with fewer retries, cleaner tool use, and less human cleanup. That is the benchmark that matters.