OpenAI’s new benchmark shows Claude and GPT-5 matching human experts at real work tasks. The worst part? Models improved 300% in just 15 months.
Source link
OpenAI’s new benchmark shows Claude and GPT-5 matching human experts at real work tasks. The worst part? Models improved 300% in just 15 months.
Source link