AI Agents as Universal Task Solvers
- URL: http://arxiv.org/abs/2510.12066v1
- Date: Tue, 14 Oct 2025 02:17:54 GMT
- Title: AI Agents as Universal Task Solvers
- Authors: Alessandro Achille, Stefano Soatto
- Abstract summary: We show that the optimal speed-up that a universal solver can achieve using past data is tightly related to their algorithmic information. We argue that the key quantity to optimize when scaling reasoning models is time, whose critical role in learning has so far only been indirectly considered.
- Score: 94.49762121230042
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI reasoning agents are already able to solve a variety of tasks by deploying tools, simulating outcomes of multiple hypotheses and reflecting on them. In doing so, they perform computation, although not in the classical sense -- there is no program being executed. Still, if they perform computation, can AI agents be universal? Can chain-of-thought reasoning solve any computable task? How does an AI Agent learn to reason? Is it a matter of model size? Or training dataset size? In this work, we reinterpret the role of learning in the context of AI Agents, viewing them as compute-capable stochastic dynamical systems, and highlight the role of time in a foundational principle for learning to reason. In doing so, we propose a shift from classical inductive learning to transductive learning -- where the objective is not to approximate the distribution of past data, but to capture their algorithmic structure to reduce the time needed to find solutions to new tasks. Transductive learning suggests that, counter to Shannon's theory, a key role of information in learning is about reduction of time rather than reconstruction error. In particular, we show that the optimal speed-up that a universal solver can achieve using past data is tightly related to their algorithmic information. Using this, we show a theoretical derivation for the observed power-law scaling of inference time versus training time. We then show that scaling model size can lead to behaviors that, while improving accuracy on benchmarks, fail any reasonable test of intelligence, let alone super-intelligence: In the limit of infinite space and time, large models can behave as savants, able to brute-force through any task without any insight. Instead, we argue that the key quantity to optimize when scaling reasoning models is time, whose critical role in learning has so far only been indirectly considered.
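The abstract's claim of a power-law scaling of inference time versus training time can be illustrated with a toy numerical sketch. The functional form, constants `c` and `alpha`, and the function `inference_time` below are illustrative assumptions for demonstration only, not values or code from the paper:

```python
import math

# Hypothetical power-law relation: inference time shrinks as a power law
# of training time, t_inf ~ c * t_train**(-alpha). The constants c and
# alpha are made up for illustration; they are not taken from the paper.
def inference_time(t_train, c=1000.0, alpha=0.5):
    """Toy power-law model of inference time versus training time."""
    return c * t_train ** (-alpha)

# On a log-log plot a power law is a straight line: the slope between any
# two points recovers the exponent -alpha.
t1, t2 = 10.0, 1000.0
slope = (math.log(inference_time(t2)) - math.log(inference_time(t1))) / (
    math.log(t2) - math.log(t1)
)
print(round(slope, 6))  # recovers -alpha = -0.5
```

The straight-line behavior in log-log coordinates is the standard empirical signature used to identify such scaling laws from measured training and inference times.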
Related papers
- Learning Without Training [0.0]
This dissertation focuses on three different projects rooted in mathematical theory for machine learning applications. The first project deals with supervised learning and manifold learning. The second project deals with transfer learning, which is the study of how an approximation process or model learned on one domain can be leveraged to improve the approximation on another domain. The third project is concerned with the classification task in machine learning, particularly in the active learning paradigm.
arXiv Detail & Related papers (2026-02-20T04:42:06Z)
- The Plausibility Trap: Using Probabilistic Engines for Deterministic Tasks [0.0]
This article defines the "Plausibility Trap": individuals with access to Artificial Intelligence deploy expensive probabilistic engines for simple deterministic tasks. We introduce Tool Selection Engineering and the Deterministic-Probabilistic Decision Matrix to help developers determine when to use Generative AI.
arXiv Detail & Related papers (2026-01-21T16:05:01Z)
- Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling [60.63703438729223]
We show how different architectures and training methods affect model multi-step reasoning capabilities. We confirm that increasing model depth plays a crucial role for sequential computations.
arXiv Detail & Related papers (2025-08-22T18:57:08Z)
- Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving [26.413753656936688]
Large Language Models (LLMs) often struggle with mathematical reasoning tasks requiring precise, verifiable computation. While Reinforcement Learning (RL) from outcome-based rewards enhances text-based reasoning, understanding how agents autonomously learn to leverage external tools like code execution remains crucial.
arXiv Detail & Related papers (2025-05-12T17:23:34Z)
- ATA: Adaptive Task Allocation for Efficient Resource Management in Distributed Machine Learning [54.08906841213777]
Asynchronous methods are fundamental for parallelizing computations in distributed machine learning. We propose ATA (Adaptive Task Allocation), a method that adapts to heterogeneous and random distributions of computation times. We show that ATA identifies the optimal task allocation and performs comparably to methods with prior knowledge of computation times.
arXiv Detail & Related papers (2025-02-02T12:22:26Z)
- Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs [76.43407125275202]
o1-like models can emulate human-like long-time thinking during inference. This paper presents the first comprehensive study on the prevalent issue of overthinking in these models. We propose strategies to mitigate overthinking, streamlining reasoning processes without compromising accuracy.
arXiv Detail & Related papers (2024-12-30T18:55:12Z)
- Scaling Laws Beyond Backpropagation [64.0476282000118]
We study the ability of Direct Feedback Alignment to train causal decoder-only Transformers efficiently.
We find that DFA fails to offer more efficient scaling than backpropagation.
arXiv Detail & Related papers (2022-10-26T10:09:14Z)
- End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking [52.05847268235338]
We show how machine learning systems can perform logical extrapolation without overthinking problems.
We propose a recall architecture that keeps an explicit copy of the problem instance in memory so that it cannot be forgotten.
We also employ a progressive training routine that prevents the model from learning behaviors specific to a fixed number of iterations and instead pushes it to learn behaviors that can be repeated indefinitely.
arXiv Detail & Related papers (2022-02-11T18:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.