Saudi Press

Saudi Arabia and the world
Thursday, Sep 18, 2025

OpenAI's o3 AI model attains human-level performance on a general intelligence exam.

OpenAI's o3 AI model attains human-level performance on a general intelligence exam.

OpenAI's o3 AI model reaches a significant milestone, attaining human-level performance on the ARC-AGI benchmark, fueling discussions on the possibilities of artificial general intelligence.
In a notable advancement, OpenAI's o3 system has reached human-level performance on a test assessing general intelligence.

On December 20, 2024, o3 achieved a score of 85% on the ARC-AGI benchmark, surpassing the previous AI best of 55% and equaling the average human score.

This represents a pivotal moment in the quest for artificial general intelligence (AGI), as the o3 system excelled in tasks that evaluate an AI's ability to adapt to new situations with limited data, an essential aspect of intelligence.

The ARC-AGI benchmark measures AI's "sample efficiency"—its capacity to learn from few examples—and is considered a crucial step toward AGI.

Unlike systems such as GPT-4, which depend on extensive datasets, o3 seems to thrive in environments with minimal training data, a significant challenge in AI development.

Although OpenAI has not fully revealed the technical specifics, o3's success might be due to its ability to detect "weak rules" or simpler patterns that can be generalized to address new problems.

The model likely explores various "chains of thought," choosing the most effective strategy based on heuristics or basic rules.

This approach is similar to methods used by systems like Google's AlphaGo, which utilizes heuristic decision-making to play the game of Go.

Despite the encouraging results, many questions remain about whether o3 truly signifies a step toward AGI.

There is speculation that the system might still depend on language-based learning instead of genuinely generalized cognitive abilities.

As OpenAI releases more information, the AI community will require further testing to evaluate o3's true adaptability and whether it can replicate the flexibility of human intelligence.

The implications of o3's performance are substantial, especially if it proves to be as adaptable as humans.

It could pave the way for an era of advanced AI systems capable of addressing a diverse array of complex tasks.

However, fully understanding its capabilities will necessitate more evaluations, leading to new benchmarks and considerations for governing AGI.
Newsletter

Related Articles

Saudi Press
0:00
0:00
Close
Trump and Starmer Clash Over UK Recognition of Palestinian State Amid State Visit
Saudi Arabia cracks down on music ‘lounges’ after conservative backlash
Saudi Arabia Signs ‘Strategic Mutual Defence’ Pact with Pakistan, Marking First Arab State to Gain Indirect Access to Nuclear Strike Capabilities in the Region
Sam Altman sells the 'Wedding Estate' in Hawaii for 49 million dollars
Turkish car manufacturer Togg Enters German Market with 5-Star Electric Sedan and SUV to Challenge European EV Brands
World’s Longest Direct Flight China Eastern to Launch 29-Hour Shanghai–Buenos Aires Direct Flight via Auckland in December
New OpenAI Study Finds Majority of ChatGPT Use Is Personal, Not Professional
Kuwait opens bidding for construction of three cities to ease housing crunch.
This Week in AI: Meta’s Superintelligence Push, xAI’s Ten Billion-Dollar Raise, Genesis AI’s Robotics Ambitions, Microsoft Restructuring, Amazon’s Million-Robot Milestone, and Google’s AlphaGenome Update
Indian Student Engineers Propose “Project REBIRTH” to Protect Aircraft from Crashes Using AI, Airbags and Smart Materials
Could AI Nursing Robots Help Healthcare Staffing Shortages?
Turkish authorities seize leading broadcaster amid fraud and tax investigation
Qatari prime minister says Netanyahu ‘killed any hope’ for Israeli hostages
Apple Introduces Ultra-Thin iPhone Air, Enhanced 17 Series and New Health-Focused Wearables
Big Oil Slashes Jobs and Investments Amid Prolonged Low Crude Prices
Social Media Access Curtailed in Turkey After CHP Calls for Rallies Following Police Blockade of Istanbul Headquarters
Did the Houthis disrupt the internet in the Middle East? Submarine cables cut in the Red Sea
Gold Could Reach Nearly $5,000 if Fed Independence Is Undermined, Goldman Sachs Warns
Uruguay, Colombia and Paraguay Secure Places at 2026 World Cup
Trump Administration Advances Plans to Rebrand Pentagon as Department of War Instead of the Fake Term Department of Defense
Tether Expands into Gold Sector with Profit-Driven Diversification
Trump’s New War – and the ‘Drug Tyrant’ Fearing Invasion: ‘1,200 Missiles Aimed at Us’
At the Parade in China: Laser Weapons, 'Eagle Strike,' and a Missile Capable of 'Striking Anywhere in the World'
Information Warfare in the Age of AI: How Language Models Become Targets and Tools
Israeli Airstrike in Yemen Kills Houthi Prime Minister
After the Shock of Defeat, Iranians Yearn for Change
YouTube Altered Content by Artificial Intelligence – Without Permission
Iran Faces Escalating Water Crisis as Protests Spread
More Than Half a Million Evacuated as Typhoon Kajiki Heads for Vietnam
HSBC Switzerland Ends Relationships with Over 1,000 Clients from Saudi Arabia, Lebanon, Qatar, and Egypt
Sharia Law Made Legally Binding in Austria Despite Warnings Over 'Incompatible' Values
Dogfights in the Skies: Airbus on Track to Overtake Boeing and Claim Aviation Supremacy
Tim Cook Promises an AI Revolution at Apple: "One of the Most Significant Technologies of Our Generation"
Are AI Data Centres the Infrastructure of the Future or the Next Crisis?
Miles Worth Billions: How Airlines Generate Huge Profits
Zelenskyy Returns to White House Flanked by European Allies as Trump Pressures Land-Swap Deal with Putin
Beijing is moving into gold and other assets, diversifying away from the dollar
Cristiano Ronaldo Makes Surprise Stop at New Hong Kong Museum
Zelenskyy to Visit Washington after Trump–Putin Summit Yields No Agreement
High-Stakes Trump-Putin Summit on Ukraine Underway in Alaska
Iranian Protection Offers Chinese Vehicle Shipments a Cost Advantage over Japanese and Korean Makers
Saudi Arabia accelerates renewables to curb domestic oil use
Cristiano Ronaldo and Georgina Rodríguez announce engagement
Asia-Pacific dominates world’s busiest flight routes, with South Korea’s Jeju–Seoul corridor leading global rankings
Private Welsh island with 19th-century fort listed for sale at over £3 million
Sam Altman challenges Elon Musk with plans for Neuralink rival
Australia to Recognize the State of Palestine at UN Assembly
The Collapse of the Programmer Dream: AI Experts Now the Real High-Earners
Armenia and Azerbaijan to Sign US-Brokered Framework Agreement for Nakhchivan Corridor
British Labour Government Utilizes Counter-Terrorism Tools for Social Media Monitoring Against Legitimate Critics
×