Anthropic’s Claude 3 Outperforms GPT-4 and Gemini Ultra in Benchmark Tests

On Monday, Anthropic, a big name in the world of AI startups, introduced its Claude 3 series of artificial intelligence models. The Claude 3 suite includes Opus, Sonnet, and the upcoming Haiku, all designed to be AI systems that you can rely on. Among these, Opus is the flagship model and reportedly the most advanced, surpassing OpenAI’s GPT-4 and Google’s Gemini Ultra in benchmark tests.

Changing The AI Game With Claude 3

While Claude 3 has three different models, Anthropic positions Opus, the star of the lineup, as more capable than any other AI system currently on the market. According to Anthropic cofounder and CEO Dario Amodei, “Opus is capable of the widest range of tasks and performs them exceptionally well.”

Claude 3 benchmarks | Anthropic

On top of that, Amodei explains that Opus outstrips top AI models like GPT-4, GPT-3.5, and Gemini Ultra on academic benchmarks such as GSM8K for basic mathematics, MMLU for undergraduate-level expert knowledge, and GPQA for graduate-level expert reasoning. Amodei says, “It seems to outperform everyone and get scores that we haven’t seen before on some tasks.”

Although Anthropic has not fully disclosed how powerful Opus is, the reported benchmarks suggest that it matches or even exceeds major alternatives like GPT-4 and Gemini Ultra in complex tasks that require advanced reasoning. This establishes Opus, at least on paper, as a new high standard for commercially accessible conversational AI.

Rest of the Lineup

Apart from Opus, the Claude 3 lineup includes Sonnet, a mid-range model that offers businesses a cost-effective option for routine tasks like data analysis and knowledge work. Meanwhile, Haiku is designed to be fast and affordable, ideal for uses like customer chatbots where speed and cost matter most. Although Haiku is not publicly available yet, the startup expects it to launch “in a matter of weeks, not months.”

The rest of the AI models | Anthropic

The Visual Improvements

The Claude 3 AI models are not only efficient at text-based queries but also offer computer vision capabilities on par with other leading models. This lets users extract text and information from images, documents, charts, diagrams, and more.

A lot of [customer] data is either highly unstructured or in some sort of visual format. Just the process of having to manually copy that information to even be able to have it interact with a generative AI tool is quite cumbersome.

Daniela Amodei

The Future of AI

Anthropic’s launch of the Claude 3 lineup is a testament to how companies are pushing boundaries, introducing models like Opus that are reported to outperform big names like GPT-4 and Gemini Ultra on benchmarks. This is all we have for now. Let us know your thoughts on Anthropic’s Claude 3 in the comments down below!

ABOUT THE AUTHOR

Hamid Murtaza


Whether it’s troubleshooting technical issues or breaking down the Internet culture, Hamid is there to make it simple for his readers. With a deep passion for writing, Hamid loves to explore different ways to convey ideas using his words. When not problem-solving, you can find him making streaks on Duolingo.