A new study from OpenAI itself reports that Anthropic's Claude Opus 4.1 outperformed GPT-5 on practical workplace tasks.
Test Results: Claude Takes the Lead
The study evaluated AI models on real-world business tasks, including:
- Responding to unhappy customer emails
- Optimizing table layouts
- Auditing prices
Accuracy by model:
- Claude Opus 4.1 - 47.6% accuracy ✓
- GPT-5 high - 38.8%
- o3 high - 34.1%
- Gemini 2.5 Pro - 25.5%
- Grok 4 - 24.3%
- GPT-4o - 12.4% (worst performance)
Where Claude Excels
Claude demonstrated superior performance in:
- Public services
- Healthcare
- Social assistance tasks
The Bottom Line
This OpenAI study indicates that Claude Opus 4.1 currently leads in practical workplace applications, with 47.6% accuracy on real-world tasks. That is about 9 percentage points ahead of OpenAI's own GPT-5 high (38.8%).
Source: TechRadar

