San Francisco, June 24, 2024 — AI development company Anthropic today announced the release of its new AI model, Claude 3.5 Sonnet. This latest model outperforms its predecessor, Claude 3 Opus, and other competitor models, offering superior performance across various evaluations while maintaining the speed and cost of the mid-tier Claude 3 Sonnet.
Claude 3.5 Sonnet is now available for free on Claude.ai and the Claude iOS app. Subscribers to the Claude Pro and Team plans can access it with significantly higher rate limits. It is also available through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. The model costs $3 per million input tokens and $15 per million output tokens and has a 200K-token context window.
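To make the per-token pricing concrete, the following is a minimal sketch of how the cost of a single request might be estimated from the two rates quoted above. The function name and the example token counts are hypothetical, chosen only for illustration; actual billing may differ from this simple arithmetic.

```python
# Prices quoted in the announcement, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 3.00
OUTPUT_PRICE_PER_MILLION = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one request.

    Hypothetical helper: it applies the announced rates linearly
    and ignores any discounts or other billing details.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: a 10,000-token prompt with a 2,000-token reply.
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.2f}")  # $0.03 for input + $0.03 for output = $0.06
```

Because output tokens cost five times as much as input tokens, a short reply can cost as much as a much longer prompt, as in the example above.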
Performance and Enhanced Features
Claude 3.5 Sonnet sets new industry benchmarks for graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval). It shows significant improvements in understanding nuance, humor, and complex instructions, and it excels at writing high-quality content with a natural, relatable tone. Operating at twice the speed of Claude 3 Opus, Claude 3.5 Sonnet is ideal for complex tasks such as context-sensitive customer support and orchestrating multi-step workflows.
In Anthropic’s internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, significantly outperforming Claude 3 Opus, which solved 38%. This evaluation tests the model’s ability to fix a bug or add functionality to an open-source codebase based on a natural language description of the desired improvement. Claude 3.5 Sonnet can independently write, edit, and execute code with sophisticated reasoning and troubleshooting capabilities, making it effective for updating legacy applications and migrating codebases.
New Feature: Artifacts
Starting today, users can access a new feature called Artifacts on Claude.ai. When a user asks Claude to generate content such as code snippets, text documents, or website designs, these Artifacts appear in a dedicated window alongside the conversation. This creates a dynamic workspace where users can view, edit, and build upon AI-generated content in real time, seamlessly integrating it into their projects and workflows. This feature marks the evolution of Claude from a conversational AI to a collaborative work environment and will soon expand to support team collaboration.
Safety and Privacy
Anthropic has subjected its models to rigorous testing to minimize misuse. Despite Claude 3.5 Sonnet’s leap in intelligence, it remains at the ASL-2 safety level. External experts have been engaged to test and refine the safety mechanisms within this latest model. For instance, Claude 3.5 Sonnet was provided to the UK’s Artificial Intelligence Safety Institute (UK AISI) for pre-deployment safety evaluation. The UK AISI completed its tests and shared the results with the US AI Safety Institute (US AISI) as part of a partnership between the two institutes.
One of the core principles guiding Anthropic’s AI model development is privacy. The company does not train its generative models on user-submitted data unless explicitly permitted by the user. To date, no customer or user-submitted data has been used to train these models.
Future Plans
Anthropic aims to significantly improve the balance between intelligence, speed, and cost every few months. To complete the Claude 3.5 model family, Claude 3.5 Haiku and Claude 3.5 Opus will be released later this year. Additionally, the company is developing new modalities and features to support more business use cases, including integrations with enterprise applications. Features like Memory, which will enable Claude to remember user preferences and interaction history as specified, are also in development to enhance personalization and efficiency.
Anthropic welcomes feedback on Claude 3.5 Sonnet, which can be submitted directly within the product to inform the development roadmap and improve user experience. The company looks forward to seeing how users build, create, and discover with Claude.