Date: March 25, 2025
The ARC Prize Foundation has announced a new AGI benchmark, ‘ARC-AGI-2’, designed to measure AI’s general fluid intelligence.
The test presents AI systems with tasks they have never seen before. These tasks tend to be easy for humans but remain hard for AI.
The test keeps the format of its predecessor, ARC-AGI-1, but makes the tasks more demanding. This significantly increases the signal the benchmark gives about an AI model’s real fluid intelligence.
ARC-AGI-2 is designed to ensure that the systems being tested demonstrate high adaptability and efficiency.
What separates ARC-AGI from other benchmarks is that, while most focus on testing ‘PhD++ skills,’ this test takes the opposite approach: it targets tasks that are relatively easy for humans but hard for AI.
As the official announcement states,
“Every ARC-AGI-2 task was solved by at least 2 humans in 2 attempts or less in a controlled study with hundreds of human participants. This matches the rules we hold for AI, which gets two attempts per task.”
François Chollet, co-founder of the ARC Prize Foundation, wrote on X,
“ARC-AGI-2 is fully human-calibrated. We tested these tasks with 400 people in live sessions, and we only kept tasks that could reliably be solved by multiple people. Each eval set (public, private, semi-private) has the exact same human difficulty – average people in our test sample achieve 60% with no prior training, and a panel of 10 people achieve 100%.”
Here are the results based on the official ARC-AGI Leaderboard.
| System | ARC-AGI-1 | ARC-AGI-2 | Efficiency (cost/task) |
|---|---|---|---|
| Human panel (at least 2 humans) | 98% | 100% | $17 |
| Human panel (average) | 64.20% | 60% | $17 |
| o3-low (CoT + Search/Synthesis) | 75.70% | 4%* | $200 |
| o1-pro (CoT + Search/Synthesis) | ~50% | 1%* | $200* |
| The ARChitects (Kaggle 2024 Winner) | 53.50% | 3% | $0.25 |
| o3-mini-high (Single CoT) | 35% | 0.00% | $0.41 |
| r1 and r1-zero (Single CoT) | 15.80% | 0.30% | $0.08 |
| gpt-4.5 (Pure LLM) | 10.30% | 0.00% | $0.29 |
ARC-AGI-2 tests AI fluid intelligence with novel visual puzzles, demanding adaptability and efficiency over brute force. Unlike ARC-AGI-1, it focuses on symbol interpretation, multi-rule reasoning, and context, using a 1,000-task training set and 120-task evaluation sets.
AI gets two attempts per task, yet top models like o3-low (4%) and o1-pro (1%) trail the human average of 60%. Tied to ARC Prize 2025, it pushes for 85% accuracy at $0.42 per task, aiming for true AGI.
The ARC Prize has returned to Kaggle, with the contest starting this week. Developers who reach 85% accuracy while spending no more than $0.42 per task are eligible for the prize. This dual focus on high performance and low cost aims to drive innovation toward efficient, adaptable AI systems, both key traits of artificial general intelligence (AGI).
The contest offers $1 million in prizes, including a $700K Grand Prize for the first team to hit the 85% threshold within Kaggle’s computing limits.
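To make the two rules in this article concrete (two attempts per task, and the 85% accuracy at $0.42 per task target), here is a minimal Python sketch. It is an illustration only, not the official ARC Prize scoring code: the `Entry` class, the function names, the grid-as-nested-list representation, and the exact-match comparison are all assumptions made for this example, and the "hypothetical future system" row is invented to show a passing case.

```python
from dataclasses import dataclass

# Thresholds quoted in the article for ARC Prize 2025.
TARGET_ACCURACY = 0.85
MAX_COST_PER_TASK = 0.42  # US dollars


@dataclass
class Entry:
    name: str
    accuracy: float       # fraction of tasks solved, 0.0 to 1.0
    cost_per_task: float  # US dollars


def task_solved(attempts, expected, max_attempts=2):
    """Assumed rule: a task is solved if one of the first two attempts
    matches the expected output grid exactly."""
    return any(attempt == expected for attempt in attempts[:max_attempts])


def meets_threshold(entry):
    """True when an entry reaches the accuracy target within the cost limit."""
    return (entry.accuracy >= TARGET_ACCURACY
            and entry.cost_per_task <= MAX_COST_PER_TASK)


# ARC-AGI-2 figures copied from the table above (asterisked values taken as-is),
# plus one made-up entry for illustration.
entries = [
    Entry("The ARChitects (Kaggle 2024 Winner)", 0.03, 0.25),
    Entry("o3-low (CoT + Search/Synthesis)", 0.04, 200.0),
    Entry("hypothetical future system", 0.86, 0.40),
]

qualifying = [e.name for e in entries if meets_threshold(e)]
print(qualifying)
```

None of the systems in the table come close on accuracy, and the most accurate ones also exceed the cost limit by a wide margin, which is the gap the contest is meant to close.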
By Arpit Dubey