Anthropic has built AI systems that independently find and fix software flaws, including serious zero-day vulnerabilities. Claude Mythos is so advanced that its public release is limited to prevent misuse.  

Key developments include:  

Claude Mythos (High Risk/High Capability) 

  • Performance: In controlled evaluations, Claude Mythos identified thousands of security vulnerabilities in major operating systems and web browsers, including previously undiscovered flaws that had persisted for over 25 years.  
  • Autonomy: The model autonomously chains multiple exploit types, such as JIT heap spraying and sandbox escapes, into a single exploit chain to achieve system-level access without human intervention.  
  • Access restriction: Because the model could be used for serious cyber attacks, Anthropic is not making it publicly available.  
  • Anthropic shares Mythos with select companies, such as Google and Apple, to enhance security.  

Claude Code Security (Production Tool) 

  • This tool, now in research preview, scans code for security issues and suggests fixes.  
  • It targets subtle context-dependent vulnerabilities such as business logic errors that are frequently overlooked by conventional static or dynamic analysis tools.  
  • The tool reviews pull requests, flags bugs before code merges, and shares summary comments identifying code issues.  

Impact and Security Risks 

Working with Mozilla, Claude found 22 vulnerabilities in Firefox in two weeks, nearly a fifth of the high-severity bugs fixed in 2025. Anthropic says the tool helps companies fix bugs faster and at scale, outpacing human teams. At the same time, Anthropic warned that attackers could use the same capabilities to exploit zero‑day vulnerabilities, which is why access to Mythos is restricted.  

Anthropic’s code review feature finds bugs in software before code is merged and is now part of their coding platform. Claude Code is available as a beta research preview for team and enterprise users.  

AI Agents to Review Code Changes 

Code review checks pull requests, which are how developers submit and review code changes before adding them to the main project.  

Anthropic says the tool uses multiple AI agents simultaneously to review code changes, spot potential bugs, and eliminate false positives. The results are shared in a single summary comment on the pull request, along with additional comments indicating severity: red for critical issues, yellow for concerns, and purple for pre-existing bugs.  

Anthropic built the tool to address growing delays in code reviews and to keep pace as AI speeds up development.  

Code review has become a bottleneck in development, and Anthropic says customers report similar problems. Developers are stretched thin, so many pull requests receive only cursory reviews. Cat Wu, head of product, says the feature targets logic errors and offers actionable feedback, addressing a frequent criticism of earlier AI review tools.  

How The System Works 

When a pull request is opened, the AI orchestrates several agents to simultaneously inspect the codebase from diverse technical perspectives. A coordinating agent then aggregates the findings, de-duplicates them, and ranks them by severity for final delivery.  
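The fan-out-then-aggregate flow described above can be sketched in a few lines of Python. This is a hypothetical illustration, not Anthropic's implementation: the `Finding` record, the toy agents, and the de-duplication key are all invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    severity: int   # higher = more severe (invented scale)
    message: str

def run_agents(diff: str, agents) -> list[Finding]:
    """Fan the diff out to every reviewer agent and collect raw findings."""
    findings: list[Finding] = []
    for agent in agents:          # could run concurrently in practice
        findings.extend(agent(diff))
    return findings

def aggregate(findings: list[Finding]) -> list[Finding]:
    """De-duplicate findings reported by multiple agents, then rank by severity."""
    unique = {(f.file, f.line, f.message): f for f in findings}
    return sorted(unique.values(), key=lambda f: -f.severity)

# Two toy agents that happen to flag the same line from different perspectives.
security_agent = lambda diff: [Finding("app.py", 10, 3, "unsanitized input")]
logic_agent = lambda diff: [Finding("app.py", 10, 3, "unsanitized input"),
                            Finding("app.py", 42, 1, "dead branch")]

report = aggregate(run_agents("<diff>", [security_agent, logic_agent]))
# Duplicate reports collapse to one entry per (file, line, message).
```

The key design point mirrored here is that de-duplication happens in the coordinating step, after all agents have reported, so the same bug seen from several perspectives surfaces only once in the summary comment.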

The system explains its reasoning step by step, showing what the issue is, why it matters, and how it could be fixed.  

Anthropic said the extent of analysis scales with the size of the code update; large or complex changes receive more extensive reviews while smaller updates undergo lighter checks. On average, a review takes around 20 minutes.  
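Size-scaled review depth can be expressed as a simple tiering function. The thresholds below are invented for illustration; the article only says that larger changes get deeper reviews.

```python
def review_depth(changed_lines: int) -> str:
    """Pick a review tier from the size of a change (thresholds are hypothetical)."""
    if changed_lines < 50:
        return "light"        # small update: lighter checks
    if changed_lines < 500:
        return "standard"
    return "extensive"        # large or complex change: more extensive review
```

A scheduler could use the tier to decide how many agents to dispatch or how long to let them run.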

84% of large pull requests contained issues (an average of 7.5 per pull request), while 31% of small ones did (an average of 0.5). Engineers generally agreed with the findings; fewer than 1% were judged incorrect.  

Code review uses a token pricing model, typically costing $15–$25 per pull request. Admins can set monthly spending limits, choose which repositories are covered, and monitor review activity and costs via dashboards.  

Source: Anthropic launches AI-powered Code Review tool to detect bugs in pull requests 

