Anthropic says its latest models can give GPT-4 and Gemini a run for their money

Published on

March 5th, 2024

Author

Aaron O'Leary

Category

News

It’s been an eventful week in the AI world. We’ve seen a new lightweight suite of LLMs from Google, a new AI model built by a Microsoft-backed company, and now another company has something to say.

Anthropic announced Claude 3 — a suite of AI models the company claims are its fastest and most powerful yet.

Claude 3 comes in three flavors: Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku, with Opus being the largest, most expensive model and Haiku being the smallest and cheapest.

Anthropic claims that Claude 3 will answer more questions, understand more context, and provide more accurate results than the company's previous models.

Opus and Sonnet are available to try now on claude.ai and through the company's API. Haiku, the smallest of the bunch, will arrive a little later. All three can be deployed in applications like chatbots, auto-completion, and data-extraction tools.

How good is it?

Unlike the company’s earlier models, Claude 3 is multimodal, so it can take both text and images as input, and it seems Claude 3 is outperforming GPT-4 across several benchmarks.

Take Claude 3 Opus as an example. According to the company, it outscored GPT-4 on a graduate-level reasoning benchmark, 50.4% to GPT-4’s 35.7%.

Even the smallest of the bunch, Haiku, is showing some impressive results, with the company saying in a blog post that it can “read a dense research paper complete with charts and graphs in less than three seconds.”

Size doesn't always matter

As mentioned above, this isn't the only big AI launch of the past few weeks. Google also launched Gemma, a suite of lightweight open models that make up for their smaller size with speed.

Now, with Anthropic also putting effort into models that handle lighter tasks at a quicker pace, it raises the question of whether the next big trend will be AI companies putting more resources into lightweight models, ones that give up some of the more mind-blowing, data-heavy abilities in order to execute routine tasks at lightning speed.

What next? 

The big question is how OpenAI responds. For some time now, the company's flagship model, GPT-4, has been the standard to beat, but as time goes on, it's only natural that competitors start to take a bigger slice of that pie.

The company has been pretty tight-lipped about when GPT-5 is coming, but as more and more companies step up to offer an alternative, the spotlight shifts to the darling AI company to see what it has up its sleeve.

Comments (14)

Jess Smith @jesssmith

Comparing Claude 3 to the company's earlier releases, Anthropic asserts that it will deliver more accurate findings, comprehend more context, and respond to more queries. See: http://truckaccidentattorneycolo...

May 16

alijen @alijen

it seems Claude 3 is outperforming GPT-4 across several benchmarks. color blind test

Apr 2

Viscanzo Smith @vaincansosmih

Although AI helps a lot, some areas of development are still too complex for it. For example, when it comes to the formula for the cross-sectional area of a wire or the development of electronics, AI can't do that currently.

Apr 4

Viscanzo Smith @vaincansosmih

Many people say that AI will take over the health sector too. I'd say that's not happening soon, but the time will come when it takes over there as well, from complex surgeries to therapy. Even therapists in Islamabad use ChatGPT for complex questions, even though they still have to do the work themselves.

Jul 19

Arjun Rajkumar @arjunrajkumar

ElephantPath

I agree. Used to be a loyal ChatGPT user, but have now shifted to Claude for everything. I pay $20/month, and the experience of using it has been much better than ChatGPT. Mainly use it for programming help while building https://getprogressupdates.com, and occasionally to ask random life/parenting questions. Very happy with it.

Nov 29