VFF - The signal in the noise
News

Anthropic releases Opus 4.8. What's in it?

Nick ZarzyckiRead original
Share
Anthropic releases Opus 4.8.  What's in it?

Claude Opus 4.8 is Anthropic’s new upgrade to the Opus model line, released on May 28, 2026

Claude Opus 4.8 is Anthropic’s new upgrade to the Opus model line, released today, May 28, 2026. It builds on Opus 4.7 with stronger benchmark performance, better collaboration behavior, improved reliability on agentic work, and the same regular pricing as Opus 4.7. Anthropic positions it less as a dramatic new generation and more as a meaningful refinement: sharper judgment, better honesty, stronger long-running task performance, and better efficiency in certain workflows.

Core positioning

The biggest theme is Claude as a more reliable collaborator, especially for coding, agentic workflows, legal work, finance, research, and professional document-heavy tasks.

Anthropic says Opus 4.8 improves across benchmarks covering coding, agentic skills, reasoning, and practical knowledge work. The article emphasizes that 4.8 is not only more capable, but also better at knowing when it is uncertain, catching mistakes, and avoiding unsupported claims.

Simply put: Opus 4.8 is designed to be less of a “smart autocomplete” and more of a dependable AI teammate that can work through large, multi-step tasks with fewer bad assumptions.

Main improvements

1. Better judgment and collaboration

Early testers describe Opus 4.8 as better at asking clarifying questions, catching its own mistakes, pushing back on weak plans, and building confidence before making major changes. This is especially important in Claude Code, where the model may be working across a complex codebase or multiple services.

That matters because a lot of AI coding tools fail not because they cannot write code, but because they make premature assumptions, touch the wrong files, or “complete” work without really validating it. Anthropic is clearly trying to make Opus 4.8 better at sustained, careful execution.

2. Stronger agentic performance

Several customer quotes focus on agent workflows. Genspark says Opus 4.8 was the only model on its Super-Agent benchmark to complete every case end-to-end, beating prior Opus models and matching GPT-5.5 at cost parity. Cursor says Opus 4.8 exceeds prior Opus models across every effort level and uses tools more efficiently.

This is probably the most important angle: Opus 4.8 is built for agents. Not just chat. Not just one-shot answers. It is aimed at workflows where the model plans, uses tools, checks outputs, and keeps going.

3. Better honesty

Anthropic calls honesty one of the most prominent improvements. The company says Opus 4.8 is more likely to flag uncertainty and less likely to make unsupported claims. Their evaluations show it is roughly four times less likely than Opus 4.7 to let flaws in its own code pass without comment.

This is a big deal for real-world use. In business, legal, financial, coding, and research workflows, the danger is not just a wrong answer. The danger is a wrong answer delivered confidently. Anthropic is saying 4.8 is better at saying, essentially: “I’m not sure,” “this part may be flawed,” or “the evidence is not strong enough.”

4. Lower misaligned behavior

Anthropic says its alignment team found that Opus 4.8 reaches new highs on “prosocial traits,” including supporting user autonomy and acting in the user’s best interest. They also say its rates of misaligned behavior, such as deception or cooperation with misuse, are substantially lower than Opus 4.7 and similar to Claude Mythos Preview, which Anthropic describes as its best-aligned model.

The practical takeaway: Anthropic is positioning 4.8 not just as smarter, but safer and more trustworthy in high-stakes workflows.

New features launching with Opus 4.8

Dynamic workflows in Claude Code

The biggest product feature is dynamic workflows, available in research preview. This lets Claude Code plan a large job, spin up hundreds of parallel subagents in one session, let those agents run longer with Opus 4.8, and then verify the outputs before reporting back.

Anthropic gives the example of codebase-scale migrations across hundreds of thousands of lines of code, from kickoff to merge, using the test suite as the validation bar. This is aimed at big engineering tasks that are too large for a single linear agent loop.

That is a major signal about where Anthropic is going: parallel AI workforces inside developer tools.

Effort control in Claude.ai and Claude Cowork

Users can now choose how much effort Claude puts into a response. Higher effort means Claude thinks more deeply and more often, improving quality on harder tasks. Lower effort means faster responses and slower consumption of rate limits. Anthropic says this effort control is available on all plans.

This is useful because not every prompt deserves maximum reasoning. A quick rewrite, summary, or small code change may not need deep thinking. A migration plan, legal analysis, architecture review, or debugging session probably does.

Messages API change

The Messages API now accepts system entries inside the messages array. Developers can update Claude’s instructions mid-task without breaking the prompt cache or forcing the update through a user message. Anthropic says this can be used to update permissions, token budgets, or environment context while an agent is running.

For agent builders, this is meaningful. It makes long-running agent workflows easier to control because the system can adjust operating instructions during execution.

Effort defaults and token usage

Opus 4.8 defaults to high effort, which Anthropic says is the best balance of quality and user experience. On coding tasks, this default uses a similar number of tokens as Opus 4.7’s default, but with better performance. Users can also choose “extra,” called xhigh in Claude Code, or “max” for harder tasks. Anthropic recommends “extra” for difficult tasks and long-running asynchronous workflows.

Anthropic also says it has increased Claude Code rate limits to accommodate the higher token usage of higher effort settings.

Pricing and availability

Claude Opus 4.8 is available everywhere now. Regular pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. Fast mode pricing is $10 per million input tokens and $50 per million output tokens. Developers can use it through the Claude API as claude-opus-4-8.

Anthropic also says fast mode for Opus 4.8 works at 2.5x speed and is now three times cheaper than fast mode was for previous models.

What Anthropic says is next

Anthropic says Opus 4.8 is a “modest but tangible” improvement over Opus 4.7. The company is also working on lower-cost models that provide many of the same capabilities as Opus. More importantly, it says it plans to release a new class of model with higher intelligence than Opus. As part of Project Glasswing, a small group of organizations is already using Claude Mythos Preview for cybersecurity work, and Anthropic says Mythos-class models may become available to all customers in the coming weeks once stronger cyber safeguards are ready.

Bottom line

The most important improvements for Claude Opus 4.8 are:

  1. Better judgment during complex work

  2. Stronger agentic and coding performance

  3. More honest self-assessment

  4. Fewer unsupported claims

  5. Better long-running workflow support

  6. User-controlled reasoning effort

  7. Dynamic parallel subagents in Claude Code

  8. Same regular pricing as Opus 4.7

For builders, the message is: Claude is moving deeper into autonomous work, especially software engineering and enterprise agent workflows. Anthropic is making Claude more capable of managing large, delegated work with less supervision.

We will see more releases framed this way as the market matures. If you found this useful, sign up for my newsletter on the homepage for more articles like it.

Share

Our Briefing

Weekly signal. No noise. Built for founders, operators, and AI-curious professionals.

No spam. Unsubscribe any time.