Balancing Sustainability And Performance: The Role Of Small-Scale LLMs In Agentic Artificial Intelligence Systems
- URL: http://arxiv.org/abs/2601.19311v1
- Date: Tue, 27 Jan 2026 07:49:55 GMT
- Title: Balancing Sustainability And Performance: The Role Of Small-Scale LLMs In Agentic Artificial Intelligence Systems
- Authors: Anh Khoa Ngo Ho, Martin Chauvin, Simon Gosset, Philippe Cordier, Boris Gamazaychikov
- Abstract summary: This study investigates whether deploying smaller-scale language models can reduce energy consumption without compromising responsiveness and output quality. Results show that smaller open-weights models can lower energy usage while preserving task quality.
- Score: 0.2796197251957245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As large language models become integral to agentic artificial intelligence systems, their energy demands during inference may pose significant sustainability challenges. This study investigates whether deploying smaller-scale language models can reduce energy consumption without compromising responsiveness and output quality in multi-agent, real-world environments. We conduct a comparative analysis across language models of varying scales to quantify trade-offs between efficiency and performance. Results show that smaller open-weights models can lower energy usage while preserving task quality. Building on these findings, we propose practical guidelines for sustainable artificial intelligence design, including optimal batch size configuration and computation resource allocation. These insights offer actionable strategies for developing scalable, environmentally responsible artificial intelligence systems.
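
To make the batch-size guideline concrete, here is a minimal sketch of the kind of measurement involved: sweeping batch sizes for a small open-weight model and reporting estimated emissions per generated token. This is not the paper's protocol; the model name, prompts, and generation settings are illustrative assumptions, and codecarbon's estimates depend on hardware energy counters being available.

```python
# Minimal sketch (not the paper's protocol): sweep batch sizes and report
# estimated emissions per generated token for a small open-weight model.
# Model name, prompts, and generation settings are illustrative assumptions.
import torch
from codecarbon import EmissionsTracker
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # assumed small open-weight model
tok = AutoTokenizer.from_pretrained(MODEL)
tok.padding_side = "left"  # left-padding is required for batched generation
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # causal LMs often ship without a pad token
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16).to("cuda")

prompts = ["Summarize the benefits of small language models."] * 64

for batch_size in (1, 8, 32):
    tracker = EmissionsTracker(log_level="error")  # reads RAPL/NVML counters
    tracker.start()
    new_tokens = 0
    for i in range(0, len(prompts), batch_size):
        batch = tok(prompts[i:i + batch_size], return_tensors="pt", padding=True).to("cuda")
        out = model.generate(**batch, max_new_tokens=64, do_sample=False)
        new_tokens += out.numel() - batch["input_ids"].numel()  # approximate
    kg_co2eq = tracker.stop()  # estimated kg CO2-eq for the measured span
    print(f"batch={batch_size}: {kg_co2eq / new_tokens:.3e} kgCO2eq per generated token")
```

A real harness would also sweep sequence lengths and average over repeated runs, since per-token energy varies with prompt length and GPU clock behavior.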
Related papers
- From Word to World: Can Large Language Models be Implicit Text-based World Models? [82.47317196099907]
Agentic reinforcement learning increasingly relies on experience-driven scaling.
World models offer a potential way to improve learning efficiency through simulated experience.
We study whether large language models can reliably serve this role and under what conditions they meaningfully benefit agents.
arXiv Detail & Related papers (2025-12-21T17:28:42Z)
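
The basic idea of an LLM standing in as a text-based world model can be sketched as follows: given a textual state and an action, the model is prompted to predict the next observation, so an agent can learn from imagined rollouts. The prompt template and the `llm()` helper are hypothetical, not the paper's setup.

```python
# Hypothetical sketch of an LLM used as a text-based world model: it maps a
# textual (state, action) pair to a predicted next observation. The prompt
# template and llm() helper are illustrative, not the paper's setup.
from typing import Callable

PROMPT = (
    "You are a world model for a text environment.\n"
    "State: {state}\nAction: {action}\n"
    "Predict the next observation in one sentence.\nNext observation:"
)

def simulate_step(llm: Callable[[str], str], state: str, action: str) -> str:
    """One simulated transition: next_state = llm(state, action)."""
    return llm(PROMPT.format(state=state, action=action)).strip()

def rollout(llm, state: str, policy, horizon: int = 5):
    """Generate imagined experience for an agent without touching the real env."""
    trajectory = []
    for _ in range(horizon):
        action = policy(state)
        state = simulate_step(llm, state, action)
        trajectory.append((action, state))
    return trajectory
```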
- AgentEvolver: Towards Efficient Self-Evolving Agent System [51.54882384204726]
We present AgentEvolver, a self-evolving agent system that drives autonomous agent learning.
AgentEvolver introduces three synergistic mechanisms: self-questioning, self-navigating, and self-attributing.
Preliminary experiments indicate that AgentEvolver achieves more efficient exploration, better sample utilization, and faster adaptation compared to traditional RL-based baselines.
arXiv Detail & Related papers (2025-11-13T15:14:47Z)
- Information Capacity: Evaluating the Efficiency of Large Language Models via Text Compression [53.39128997308138]
We introduce information capacity, a measure of model efficiency based on text compression performance.
Empirical evaluations on mainstream open-source models show that models of varying sizes within a series exhibit consistent information capacity.
A distinctive feature of information capacity is that it incorporates tokenizer efficiency, which affects both input and output token counts.
arXiv Detail & Related papers (2025-11-11T10:07:32Z)
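
Compression-based efficiency measures rest on a simple quantity: a model that assigns higher likelihood to text encodes it in fewer bits. Below is a minimal sketch of that quantity, bits per byte under a causal LM; the paper's exact "information capacity" normalization may differ, and the model and sample text are assumptions.

```python
# Minimal sketch: compression performance of a causal LM measured as bits per
# byte of raw text. The paper's exact normalization may differ; the model name
# and sample text are illustrative assumptions.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # any open-source causal LM
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def bits_per_byte(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # labels=ids yields mean cross-entropy (nats) over shifted targets
        nll = model(ids, labels=ids).loss.item()
    total_nats = nll * (ids.shape[1] - 1)  # loss averages over seq_len - 1 targets
    return total_nats / math.log(2) / len(text.encode("utf-8"))

print(bits_per_byte("Smaller models can be surprisingly strong compressors of text."))
```

Because the denominator is bytes rather than tokens, tokenizer efficiency is automatically reflected, which is the feature the abstract highlights.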
- Comparing energy consumption and accuracy in text classification inference [0.9208007322096533]
This study systematically evaluates the trade-offs between model accuracy and energy consumption in text classification inference.
The best-performing model in terms of accuracy can also be energy-efficient, while larger LLMs tend to consume significantly more energy with lower classification accuracy.
arXiv Detail & Related papers (2025-08-19T18:00:08Z)
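
An accuracy/energy comparison of this kind can be instrumented directly from GPU power readings. The sketch below uses NVML's power query as a crude two-point estimate; the classifier, texts, and labels are assumptions, and a real harness would poll power on a background thread rather than sampling twice.

```python
# Illustrative sketch: jointly tracking accuracy and GPU energy for a text
# classifier. Uses NVML power readings; model and data are assumptions, and a
# two-point power sample stands in for proper continuous instrumentation.
import time
import pynvml
from transformers import pipeline

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

clf = pipeline("sentiment-analysis", device=0)  # small encoder classifier
texts = ["great battery life", "screen died after a week"] * 50
labels = ["POSITIVE", "NEGATIVE"] * 50

start = time.time()
p0 = pynvml.nvmlDeviceGetPowerUsage(gpu)  # milliwatts
preds = [r["label"] for r in clf(texts, batch_size=16)]
p1 = pynvml.nvmlDeviceGetPowerUsage(gpu)
elapsed = time.time() - start

joules = (p0 + p1) / 2 / 1000 * elapsed  # crude average-power estimate
acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
print(f"accuracy={acc:.2f}  energy~{joules:.1f} J  ({joules / len(texts):.2f} J/sample)")
```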
- The Cost of Dynamic Reasoning: Demystifying AI Agents and Test-Time Scaling from an AI Infrastructure Perspective [3.0868637098088403]
Large-language-model (LLM)-based AI agents have recently showcased impressive versatility by employing dynamic reasoning.
This paper presents the first comprehensive system-level analysis of AI agents, quantifying their resource usage, latency behavior, energy consumption, and test-time scaling strategies.
Our findings reveal that while agents improve accuracy with increased compute, they suffer from rapidly diminishing returns, widening latency variance, and unsustainable infrastructure costs.
arXiv Detail & Related papers (2025-06-04T14:37:54Z)
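
The test-time-scaling trade-off the abstract describes has a simple skeleton: sample more reasoning paths, vote on the answer, and pay linearly in tokens. Here is a hypothetical sketch of that accounting; `sample_answer()` is a stand-in for an actual LLM call, not any particular system's API.

```python
# Hypothetical sketch of the accuracy-vs-compute trade-off in test-time
# scaling: sample N reasoning paths, majority-vote the answers, and track
# token cost. sample_answer() is a stand-in for an actual LLM call.
from collections import Counter
from typing import Callable, Tuple

def scaled_answer(
    sample_answer: Callable[[str], Tuple[str, int]],  # returns (answer, tokens used)
    question: str,
    n_samples: int,
):
    answers, tokens = [], 0
    for _ in range(n_samples):
        ans, used = sample_answer(question)
        answers.append(ans)
        tokens += used
    best, _ = Counter(answers).most_common(1)[0]  # self-consistency vote
    return best, tokens

# Sweeping n_samples over (1, 4, 16, 64) exposes the diminishing returns the
# paper reports: token cost grows linearly while accuracy gains flatten.
```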
- Systematic Weight Evaluation for Pruning Large Language Models: Enhancing Performance and Sustainability [1.542607498220242]
This research focuses on the systematic evaluation of individual weight importance throughout the training process.
We propose a method that effectively reduces model size without compromising performance.
These findings highlight the critical need for optimized AI models to ensure sustainable development.
arXiv Detail & Related papers (2025-02-24T11:34:49Z)
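
For readers unfamiliar with pruning mechanics, here is a minimal sketch using PyTorch's built-in utilities. The paper evaluates weight importance during training; plain L1 magnitude pruning is shown here as a simpler stand-in, not the authors' method.

```python
# Minimal sketch of magnitude-based pruning with PyTorch's built-in utilities.
# L1 magnitude pruning is a simple stand-in for the paper's systematic
# weight-importance evaluation, not the authors' method.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(4096, 4096)

# Zero out the 30% of weights with the smallest |w|; a mask is applied in place.
prune.l1_unstructured(layer, name="weight", amount=0.3)
print(f"sparsity: {(layer.weight == 0).float().mean():.0%}")

# Make the pruning permanent (folds the mask into the weight tensor).
prune.remove(layer, "weight")
```

Note that unstructured sparsity only saves energy on hardware or kernels that exploit it; structured pruning trades flexibility for speedups on commodity GPUs.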
- Addressing the sustainable AI trilemma: a case study on LLM agents and RAG [7.6212949300713015]
Large language models (LLMs) have demonstrated significant capabilities, but their widespread deployment and more advanced applications raise critical sustainability challenges.
We propose the concept of the Sustainable AI Trilemma, highlighting the tensions between AI capability, digital equity, and environmental sustainability.
arXiv Detail & Related papers (2025-01-14T17:21:16Z)
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models are showing increasingly better performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
- Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
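
The general shape of an entropy-regularized token-level objective like ETPO's can be sketched as a per-token policy-gradient loss with an entropy bonus. Tensor shapes and the entropy coefficient below are illustrative assumptions; this is not the paper's exact objective.

```python
# Sketch of an entropy-regularized token-level policy-gradient loss, the
# general shape of objectives like ETPO. Shapes and the entropy coefficient
# are illustrative; this is not the paper's exact objective.
import torch

def token_level_pg_loss(
    logits: torch.Tensor,      # (batch, seq_len, vocab) from the LLM policy
    actions: torch.Tensor,     # (batch, seq_len) sampled token ids
    advantages: torch.Tensor,  # (batch, seq_len) per-token credit assignment
    beta: float = 0.01,        # entropy bonus weight
) -> torch.Tensor:
    dist = torch.distributions.Categorical(logits=logits)
    logp = dist.log_prob(actions)       # log pi(token | context), per token
    pg = -(advantages.detach() * logp)  # policy-gradient term per token
    entropy = dist.entropy()            # per-token entropy, rewards exploration
    return (pg - beta * entropy).mean()

# Usage: loss = token_level_pg_loss(model(ids).logits, ids, adv); loss.backward()
```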
- Energy-frugal and Interpretable AI Hardware Design using Learning Automata [5.514795777097036]
A new machine learning algorithm, called the Tsetlin machine, has been proposed.
In this paper, we investigate methods of energy-frugal artificial intelligence hardware design.
We show that frugal resource allocation can provide decisive energy reduction while also achieving robust and interpretable learning.
arXiv Detail & Related papers (2023-05-19T15:11:18Z)
- Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent which minimizes the novel free energy of the expected future.
Our model is capable of solving sparse-reward problems with a very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
arXiv Detail & Related papers (2021-06-04T10:03:36Z)