In brief
- OpenAI has introduced new internal tests for project-level performance.
- Scientific and mathematical benchmarks showed higher scores than prior models.
- The announcement comes as OpenAI strikes deals to integrate GPT into the U.S. government and corporations.
Just weeks after its last major release, OpenAI is aggressively pivoting its flagship ChatGPT from a consumer novelty to an indispensable corporate powerhouse.
On Thursday, the company released GPT-5.2, a new large language model it claims is faster, more reliable, and designed to handle complex professional workflows.
The update signals that OpenAI is moving beyond homework help and general queries, aiming instead to embed its technology as an essential, daily tool in the business world, as evidenced by its lucrative deals with the U.S. government and Disney.
“We designed GPT‑5.2 to unlock even more economic value for people,” OpenAI said in a statement. “It’s better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long contexts, using tools, and handling complex, multi-step projects.”
The new benchmark for workplace automation
Touting the performance of GPT-5.2, the company introduced a proprietary evaluation benchmark, GDPval, that simulates tasks across 44 occupations.
GPT-5.2 matched or exceeded human worker performance in approximately 71% of the comparisons, the company claims.
“On GDPval, the thinking model beats or ties human experts on 70.9% of common professional tasks like spreadsheets, presentations, and document creation,” Fidji Simo, OpenAI's CEO of Applications, wrote on X. “It’s also better at general intelligence, writing code, tool calling, vision, and long-context understanding so it can unlock even more economic value for people.”
It is unclear whether the benchmark has undergone external review, leaving industry experts to wait for independent verification of the claims.
Technical breakdown: Three models for three jobs
GPT-5.2 became available across paid subscription tiers on Thursday, with API access opening the same day. Developers can now choose from three distinct versions, each optimized for different professional needs.
- Instant: For quick, simple professional tasks.
- Thinking: For more complex, multi-step tasks.
- Pro: The top-tier model, built for intensive research and long-form projects.
API pricing has been set at $1.75 per million input tokens and $14 per million output tokens.
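For a sense of what those rates mean in practice, here is a minimal Python sketch that estimates the cost of a single request from its token counts. The function name and the sample token counts are hypothetical, chosen only for illustration, and the sketch assumes the quoted rates apply uniformly to every token; the article does not say which of the three versions the prices cover or whether discounts apply.

```python
# Rates quoted in the article, in U.S. dollars per million tokens.
INPUT_USD_PER_MILLION = 1.75
OUTPUT_USD_PER_MILLION = 14.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in U.S. dollars.

    Hypothetical helper: assumes flat per-token pricing at the quoted
    rates, with no discounts, caching, or per-version differences.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 200,000 input tokens and 50,000 output tokens.
    cost = estimate_cost_usd(200_000, 50_000)
    print(f"Estimated cost: ${cost:.2f}")
```

Because output tokens cost eight times as much as input tokens at these rates, a request that generates long responses can be dominated by the output side even when the prompt is much larger.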
In addition to the GDPval benchmark, GPT-5.2 showed improved performance on established technical tests, posting higher scores on GPQA Diamond and FrontierMath. It also reportedly demonstrated more reliable results in demanding tasks like coding, data analysis, and experimental design.
In the announcement, the company presented several glowing testimonials from early testers.
The release of a more competent workplace AI arrives in an already tense labor environment.
Corporate executives appear largely optimistic, with a recent Just Capital survey showing 93% of business leaders view AI as a positive force. Yet the same study found that nearly half of Americans expect the technology to eliminate jobs, a concern that executives reportedly share to a lesser degree.
