AI | October 10, 2025

Claude Beats GPT-5 in Real-World Work Tasks, According to OpenAI Study

A surprising new study from OpenAI itself reveals that Claude Opus 4.1 outperformed GPT-5 in practical workplace scenarios.

Test Results: Claude Takes the Lead

The study evaluated AI models on real-world business tasks, including:

Responding to unhappy customer emails
Optimizing table layouts
Auditing prices

Performance Rankings:

Claude Opus 4.1 - 47.6% accuracy ✓
GPT-5 high - 38.8%
o3 high - 34.1%
Gemini 2.5 Pro - 25.5%
Grok 4 - 24.3%
GPT-4o - 12.4% (worst performance)

Where Claude Excels

Claude demonstrated superior performance in:

Public services
Healthcare
Social assistance tasks

The Bottom Line

This OpenAI study indicates that Claude Opus 4.1 currently leads in practical workplace applications, with 47.6% accuracy on real-world tasks, almost 9 percentage points ahead of OpenAI's own GPT-5 high (38.8%).

Source: TechRadar
