An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence
- URL: http://arxiv.org/abs/2511.13411v1
- Date: Mon, 17 Nov 2025 14:24:27 GMT
- Title: An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence
- Authors: Przemyslaw Chojecki
- Abstract summary: We propose a Kardashev-inspired yet operational Autonomous AI (AAI) Scale. It measures the progression from fixed robotic process automation (AAI-0) to full artificial general intelligence (AAI-4) and beyond. We define ten capability axes (Autonomy, Generality, Planning, Memory/Persistence, Tool Economy, Self-Revision, Sociality/Coordination, Embodiment, World-Model Fidelity, Economic Throughput) aggregated by a composite AAI-Index.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a Kardashev-inspired yet operational Autonomous AI (AAI) Scale that measures the progression from fixed robotic process automation (AAI-0) to full artificial general intelligence (AAI-4) and beyond. Unlike narrative ladders, our scale is multi-axis and testable. We define ten capability axes (Autonomy, Generality, Planning, Memory/Persistence, Tool Economy, Self-Revision, Sociality/Coordination, Embodiment, World-Model Fidelity, Economic Throughput) aggregated by a composite AAI-Index (a weighted geometric mean). We introduce a measurable Self-Improvement Coefficient $\kappa$ (capability growth per unit of agent-initiated resources) and two closure properties (maintenance and expansion) that convert "self-improving AI" into falsifiable criteria. We specify OWA-Bench, an open-world agency benchmark suite that evaluates long-horizon, tool-using, persistent agents. We define level gates for AAI-0 through AAI-4 using thresholds on the axes, $\kappa$, and closure proofs. Synthetic experiments illustrate how present-day systems map onto the scale and how the delegability frontier (quality vs. autonomy) advances with self-improvement. We also prove a theorem that an AAI-3 agent becomes AAI-5 over time under sufficient conditions, formalizing the intuition that a "baby AGI" becomes a Superintelligence.
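As an illustration of the abstract's aggregation scheme, the composite AAI-Index (a weighted geometric mean over the ten axes) and a ratio form of the Self-Improvement Coefficient $\kappa$ can be sketched as below. The axis weights, score scale in (0, 1], and the exact operationalization of $\kappa$ are illustrative assumptions; the paper's own rubric is not reproduced here.

```python
import math

# The ten capability axes named in the abstract.
AXES = ["Autonomy", "Generality", "Planning", "Memory/Persistence",
        "Tool Economy", "Self-Revision", "Sociality/Coordination",
        "Embodiment", "World-Model Fidelity", "Economic Throughput"]

def aai_index(scores, weights=None):
    """Composite AAI-Index: weighted geometric mean of axis scores.

    scores  -- one score per axis, each in (0, 1] (assumed scale)
    weights -- optional per-axis weights; uniform if omitted
    """
    if weights is None:
        weights = [1.0] * len(scores)
    total = sum(weights)
    # Geometric mean computed in log space for numerical stability.
    log_mean = sum(w * math.log(s) for w, s in zip(weights, scores)) / total
    return math.exp(log_mean)

def kappa(capability_gain, agent_initiated_resources):
    """Self-Improvement Coefficient: capability growth per unit of
    agent-initiated resources (illustrative ratio form)."""
    return capability_gain / agent_initiated_resources

# Hypothetical axis scores for a present-day tool-using agent.
scores = [0.6, 0.5, 0.7, 0.4, 0.8, 0.3, 0.5, 0.2, 0.6, 0.4]
print(f"AAI-Index: {aai_index(scores):.3f}")
print(f"kappa: {kappa(0.2, 4.0):.3f}")
```

A geometric (rather than arithmetic) mean penalizes agents that are strong on most axes but near zero on one, which matches the abstract's framing of level gates as thresholds that every axis must clear.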
Related papers
- The Geometry of Benchmarks: A New Path Toward AGI [0.0]
We introduce a geometric framework in which all psychometric batteries for AI agents are treated as points in a structured moduli space. First, we define an Autonomous AI (AAI) Scale, a Kardashev-style hierarchy of autonomy grounded in measurable performance. Second, we construct a moduli space of batteries, identifying equivalence classes of benchmarks that are indistinguishable at the level of agent orderings and capability inferences. Third, we introduce a general Generator-Verifier-Updater (GVU) operator that subsumes reinforcement learning, self-play, debate, and verifier-based fine-tuning.
arXiv Detail & Related papers (2025-12-03T21:34:09Z) - LIMI: Less is More for Agency [49.63355240818081]
LIMI (Less Is More for Intelligent Agency) demonstrates that agency follows radically different development principles. We show that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior. Our findings establish the Agency Efficiency Principle: machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.
arXiv Detail & Related papers (2025-09-22T10:59:32Z) - The next question after Turing's question: Introducing the Grow-AI test [51.56484100374058]
This study aims to extend the framework for assessing artificial intelligence with GROW-AI. GROW-AI is designed to answer the question "Can machines grow up?", a natural successor to the Turing Test. The originality of the work lies in the conceptual transposition of the process of "growing up" from the human world to that of artificial intelligence.
arXiv Detail & Related papers (2025-08-22T10:19:42Z) - Holistic Evaluation of Multimodal LLMs on Spatial Intelligence [81.2547965083228]
We propose EASI for holistic Evaluation of multimodAl LLMs on Spatial Intelligence. We conduct the study across eight key benchmarks, at a cost exceeding ten billion total tokens. Our empirical study reveals that GPT-5 demonstrates unprecedented strength in spatial intelligence (SI), yet still falls significantly short of human performance across a broad spectrum of SI tasks.
arXiv Detail & Related papers (2025-08-18T17:55:17Z) - ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning [49.25518866694287]
We propose ML-Master, a novel AI4AI agent that seamlessly integrates exploration and reasoning by employing a selectively scoped memory mechanism. We evaluate ML-Master on MLE-Bench, where it achieves a 29.3% average medal rate, significantly surpassing existing methods.
arXiv Detail & Related papers (2025-06-19T17:53:28Z) - Scalable, Symbiotic, AI and Non-AI Agent Based Parallel Discrete Event Simulations [0.0]
This paper introduces a novel parallel discrete event simulation (PDES) based methodology for combining multiple AI and non-AI agents. We evaluate our approach by solving four problems from four different domains and comparing the results with those from AI models alone. Results show that the overall accuracy of our approach is 68%, whereas the accuracy of the vanilla models is below 23%.
arXiv Detail & Related papers (2025-05-28T17:50:01Z) - General Scales Unlock AI Evaluation with Explanatory and Predictive Power [57.7995945974989]
Benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems. We introduce general scales for AI evaluation that can explain what common AI benchmarks really measure. Our fully automated methodology builds on 18 newly crafted rubrics that place instance demands on general scales that do not saturate.
arXiv Detail & Related papers (2025-03-09T01:13:56Z) - Evaluating Intelligence via Trial and Error [59.80426744891971]
We introduce Survival Game as a framework to evaluate intelligence based on the number of failed attempts in a trial-and-error process. When the expectation and variance of failure counts are both finite, it signals the ability to consistently find solutions to new challenges. Our results show that while AI systems achieve the Autonomous Level in simple tasks, they are still far from it in more complex tasks.
arXiv Detail & Related papers (2025-02-26T05:59:45Z) - Universal AI maximizes Variational Empowerment [0.0]
We build on the existing framework of Self-AIXI, a universal learning agent that predicts its own actions. We argue that the power-seeking tendencies of universal AI agents can be explained as an instrumental strategy to secure future reward. Our main contribution is to show how these motivations systematically lead universal AI agents to seek and sustain high-optionality states.
arXiv Detail & Related papers (2025-02-20T02:58:44Z) - Integration of Agentic AI with 6G Networks for Mission-Critical Applications: Use-case and Challenges [12.015880968827384]
Agentic AI (AAI) has gained a lot of attention recently due to its ability to analyze textual data through a contextual lens. We propose a novel framework with a multi-layer architecture to realize the AAI. Preliminary analysis shows that the AAI reduces initial response time by 5.6 minutes on average.
arXiv Detail & Related papers (2025-02-19T07:00:53Z) - Compliance Cards: Automated EU AI Act Compliance Analyses amidst a Complex AI Supply Chain [9.991293429067065]
We introduce a complete system for provider-side AIA compliance analyses amidst a complex AI supply chain.
First is an interlocking set of computational, multi-stakeholder transparency artifacts that capture AIA-specific metadata about both the AI system and its supply chain.
Second is an algorithm that operates across all those artifacts to render a real-time prediction about whether or not the aggregate AI system or model complies with the AIA.
arXiv Detail & Related papers (2024-06-20T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of the information presented and is not responsible for any consequences of its use.