September 26, 2025, 04:26
OpenAI says GPT-5 rivals human professionals in new benchmark
OpenAI has released a new benchmark, GDPval, designed to measure how its AI systems perform against human professionals across major industries. The goal: to track progress toward artificial general intelligence (AGI) by testing how well AI handles economically valuable work.
In its initial results, OpenAI found that GPT-5 — along with Anthropic’s Claude Opus 4.1 — is already approaching the quality of work delivered by industry experts.
GDPval evaluates AI performance across nine industries that make up the largest share of U.S. GDP, including healthcare, finance, government, and manufacturing. Within those sectors, the test spans 44 occupations, from nurses and software engineers to journalists.
For the first version, GDPval-v0, OpenAI asked professionals to compare AI-generated reports with those written by peers and pick the stronger output. One example had investment bankers create a competitor analysis for the last-mile delivery market, with evaluators ranking the AI and human versions side by side.
Results show:
GPT-5-high — a more compute-intensive variant of GPT-5 — matched or outperformed experts in 40.6% of cases.
Claude Opus 4.1 performed even better, with a 49% win rate. OpenAI suggested Claude’s advantage may stem partly from its visually appealing report formatting rather than pure analytical strength.
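To make the percentages above concrete, here is a minimal sketch of how a "matched or outperformed" rate could be tallied from side-by-side verdicts. The verdict labels (`"model"`, `"expert"`, `"tie"`), the function name, and the sample data are all hypothetical and chosen only for illustration; this is not OpenAI's actual grading pipeline, and the article does not say how ties were handled in each reported figure.

```python
from collections import Counter


def win_or_tie_rate(verdicts):
    """Return the share of comparisons where the model's output was
    judged better than, or as good as, the human expert's output."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    counts = Counter(verdicts)
    # Counter returns 0 for labels that never appear.
    return (counts["model"] + counts["tie"]) / len(verdicts)


# Hypothetical verdicts from eight blind comparisons (not real GDPval data).
verdicts = ["model", "expert", "tie", "expert",
            "model", "expert", "expert", "expert"]

rate = win_or_tie_rate(verdicts)
print(f"Matched or outperformed experts in {rate:.1%} of cases")
```

In this toy sample the model wins twice and ties once across eight comparisons, so the rate is 3/8.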
OpenAI cautioned that GDPval still covers only a narrow slice of real-world job tasks. While it highlights rapid progress, the company stressed that its models are not ready to replace human workers. Instead, it sees GDPval as an early framework for tracking AI’s trajectory toward its long-term mission of developing AGI.
Author: Kandi Srinivasa Reddy