An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence
- URL: http://arxiv.org/abs/2511.13411v1
- Date: Mon, 17 Nov 2025 14:24:27 GMT
- Title: An Operational Kardashev-Style Scale for Autonomous AI - Towards AGI and Superintelligence
- Authors: Przemyslaw Chojecki
- Abstract summary: We propose a Kardashev-inspired yet operational Autonomous AI (AAI) Scale. It measures the progression from fixed robotic process automation (AAI-0) to full artificial general intelligence (AAI-4) and beyond. We define ten capability axes (Autonomy, Generality, Planning, Memory/Persistence, Tool Economy, Self-Revision, Sociality/Coordination, Embodiment, World-Model Fidelity, Economic Throughput) aggregated by a composite AAI-Index.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a Kardashev-inspired yet operational Autonomous AI (AAI) Scale that measures the progression from fixed robotic process automation (AAI-0) to full artificial general intelligence (AAI-4) and beyond. Unlike narrative ladders, our scale is multi-axis and testable. We define ten capability axes (Autonomy, Generality, Planning, Memory/Persistence, Tool Economy, Self-Revision, Sociality/Coordination, Embodiment, World-Model Fidelity, Economic Throughput) aggregated by a composite AAI-Index (a weighted geometric mean). We introduce a measurable Self-Improvement Coefficient $\kappa$ (capability growth per unit of agent-initiated resources) and two closure properties (maintenance and expansion) that convert "self-improving AI" into falsifiable criteria. We specify OWA-Bench, an open-world agency benchmark suite that evaluates long-horizon, tool-using, persistent agents. We define level gates for AAI-0 through AAI-4 using thresholds on the axes, $\kappa$, and closure proofs. Synthetic experiments illustrate how present-day systems map onto the scale and how the delegability frontier (quality vs. autonomy) advances with self-improvement. We also prove a theorem that an AAI-3 agent becomes AAI-5 over time under sufficient conditions, formalizing the intuition that a "baby AGI" becomes a Superintelligence.
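As an illustration of the abstract's aggregation scheme, the composite AAI-Index (a weighted geometric mean over the ten axes) and a ratio form of the Self-Improvement Coefficient $\kappa$ can be sketched as below. The axis weights, score scale in (0, 1], and the exact operationalization of $\kappa$ are illustrative assumptions; the paper's own rubric is not reproduced here.

```python
import math

# The ten capability axes named in the abstract.
AXES = ["Autonomy", "Generality", "Planning", "Memory/Persistence",
        "Tool Economy", "Self-Revision", "Sociality/Coordination",
        "Embodiment", "World-Model Fidelity", "Economic Throughput"]

def aai_index(scores, weights=None):
    """Composite AAI-Index: weighted geometric mean of axis scores.

    scores  -- one score per axis, each in (0, 1] (assumed scale)
    weights -- optional per-axis weights; uniform if omitted
    """
    if weights is None:
        weights = [1.0] * len(scores)
    total = sum(weights)
    # Geometric mean computed in log space for numerical stability.
    log_mean = sum(w * math.log(s) for w, s in zip(weights, scores)) / total
    return math.exp(log_mean)

def kappa(capability_gain, agent_initiated_resources):
    """Self-Improvement Coefficient: capability growth per unit of
    agent-initiated resources (illustrative ratio form)."""
    return capability_gain / agent_initiated_resources

# Hypothetical axis scores for a present-day tool-using agent.
scores = [0.6, 0.5, 0.7, 0.4, 0.8, 0.3, 0.5, 0.2, 0.6, 0.4]
print(f"AAI-Index: {aai_index(scores):.3f}")
print(f"kappa: {kappa(0.2, 4.0):.3f}")
```

A geometric (rather than arithmetic) mean penalizes agents that are strong on most axes but near zero on one, which matches the abstract's framing of level gates as thresholds that every axis must clear.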
Related papers
- The Geometry of Benchmarks: A New Path Toward AGI [0.0]
We introduce a geometric framework in which all psychometric batteries for AI agents are treated as points in a structured moduli space. First, we define an Autonomous AI (AAI) Scale, a Kardashev-style hierarchy of autonomy grounded in measurable performance. Second, we construct a moduli space of batteries, identifying equivalence classes of benchmarks that are indistinguishable at the level of agent orderings and capability inferences. Third, we introduce a general Generator-Verifier-Updater (GVU) operator that subsumes reinforcement learning, self-play, debate, and verifier-based fine-tuning.
arXiv Detail & Related papers (2025-12-03T21:34:09Z) - LIMI: Less is More for Agency [49.63355240818081]
LIMI (Less Is More for Intelligent Agency) demonstrates that agency follows radically different development principles. We show that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior. Our findings establish the Agency Efficiency Principle: machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.
arXiv Detail & Related papers (2025-09-22T10:59:32Z) - The next question after Turing's question: Introducing the Grow-AI test [51.56484100374058]
This study aims to extend the framework for assessing artificial intelligence with GROW-AI. GROW-AI is designed to answer the question "Can machines grow up?", a natural successor to the Turing Test. The originality of the work lies in the conceptual transposition of the process of "growing up" from the human world to that of artificial intelligence.
arXiv Detail & Related papers (2025-08-22T10:19:42Z) - Holistic Evaluation of Multimodal LLMs on Spatial Intelligence [81.2547965083228]
We propose EASI for holistic Evaluation of multimodAl LLMs on Spatial Intelligence. We conduct the study across eight key benchmarks, at a cost exceeding ten billion total tokens. Our empirical study reveals that GPT-5 demonstrates unprecedented strength in spatial intelligence (SI), yet still falls significantly short of human performance across a broad spectrum of SI tasks.
arXiv Detail & Related papers (2025-08-18T17:55:17Z) - ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning [49.25518866694287]
We propose ML-Master, a novel AI4AI agent that seamlessly integrates exploration and reasoning by employing a selectively scoped memory mechanism. We evaluate ML-Master on MLE-Bench, where it achieves a 29.3% average medal rate, significantly surpassing existing methods.
arXiv Detail & Related papers (2025-06-19T17:53:28Z) - Scalable, Symbiotic, AI and Non-AI Agent Based Parallel Discrete Event Simulations [0.0]
This paper introduces a novel parallel discrete event simulation (PDES) based methodology for combining multiple AI and non-AI agents. We evaluate our approach by solving four problems from four different domains and comparing the results with those from AI models alone. Results show that the overall accuracy of our approach is 68%, whereas the accuracy of the vanilla models is below 23%.
arXiv Detail & Related papers (2025-05-28T17:50:01Z) - General Scales Unlock AI Evaluation with Explanatory and Predictive Power [57.7995945974989]
Benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems. We introduce general scales for AI evaluation that can explain what common AI benchmarks really measure. Our fully automated methodology builds on 18 newly crafted rubrics that place instance demands on general scales that do not saturate.
arXiv Detail & Related papers (2025-03-09T01:13:56Z) - Evaluating Intelligence via Trial and Error [59.80426744891971]
We introduce Survival Game as a framework to evaluate intelligence based on the number of failed attempts in a trial-and-error process. When the expectation and variance of failure counts are both finite, it signals the ability to consistently find solutions to new challenges. Our results show that while AI systems achieve the Autonomous Level in simple tasks, they are still far from it in more complex tasks.
arXiv Detail & Related papers (2025-02-26T05:59:45Z) - Universal AI maximizes Variational Empowerment [0.0]
We build on the existing framework of Self-AIXI, a universal learning agent that predicts its own actions. We argue that the power-seeking tendencies of universal AI agents can be explained as an instrumental strategy to secure future reward. Our main contribution is to show how these motivations systematically lead universal AI agents to seek and sustain high-optionality states.
arXiv Detail & Related papers (2025-02-20T02:58:44Z) - Integration of Agentic AI with 6G Networks for Mission-Critical Applications: Use-case and Challenges [12.015880968827384]
Agentic AI (AAI) has gained a lot of attention recently due to its ability to analyze textual data through a contextual lens. We propose a novel framework with a multi-layer architecture to realize the AAI. Preliminary analysis shows that the AAI reduces initial response time by 5.6 minutes on average.
arXiv Detail & Related papers (2025-02-19T07:00:53Z) - Compliance Cards: Automated EU AI Act Compliance Analyses amidst a Complex AI Supply Chain [9.991293429067065]
We introduce a complete system for provider-side AIA compliance analyses amidst a complex AI supply chain.
First is an interlocking set of computational, multi-stakeholder transparency artifacts that capture AIA-specific metadata about both the AI system and its supply chain.
Second is an algorithm that operates across all those artifacts to render a real-time prediction about whether or not the aggregate AI system or model complies with the AIA.
arXiv Detail & Related papers (2024-06-20T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of the information presented and is not responsible for any consequences of its use.