Energy-Aware LLMs: A step towards sustainable AI for downstream applications
- URL: http://arxiv.org/abs/2503.17783v1
- Date: Sat, 22 Mar 2025 14:28:29 GMT
- Title: Energy-Aware LLMs: A step towards sustainable AI for downstream applications
- Authors: Nguyen Phuc Tran, Brigitte Jaumard, Oscar Delgado
- Abstract summary: Advanced Large Language Models (LLMs) have revolutionized various fields, including communication networks. LLMs typically require huge computational resources, resulting in extremely high energy consumption. This research study proposes an end-to-end pipeline that investigates the trade-off between energy efficiency and model performance.
- Score: 0.9012198585960441
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advanced Large Language Models (LLMs) have revolutionized various fields, including communication networks, sparking an innovation wave that has led to new applications and services, and significantly enhanced solution schemes. Despite all these impressive developments, most LLMs typically require huge computational resources, resulting in extremely high energy consumption. Thus, this research study proposes an end-to-end pipeline that investigates the trade-off between energy efficiency and model performance for an LLM during fault ticket analysis in communication networks. It further evaluates the pipeline performance using two real-world datasets for the tasks of root cause analysis and response feedback in a communication network. Our results show that an appropriate combination of quantization and pruning techniques can reduce energy consumption while significantly improving model performance.
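The abstract does not spell out the exact compression recipe, but the kind of combination it describes can be illustrated with standard tooling. Below is a minimal sketch, assuming a PyTorch / Hugging Face model, that applies unstructured magnitude pruning followed by post-training dynamic INT8 quantization; the checkpoint name and the 30% pruning ratio are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch (not the paper's pipeline): magnitude pruning + dynamic INT8 quantization.
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForSequenceClassification

# Hypothetical checkpoint; the paper's fault-ticket model is not specified here.
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# 1) Zero out the 30% smallest-magnitude weights in every linear layer.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# 2) Post-training dynamic quantization: linear layers execute in INT8 at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# The compressed model would then be evaluated for both task quality and
# energy per query to chart the efficiency/performance trade-off.
```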
Related papers
- Energy Considerations of Large Language Model Inference and Efficiency Optimizations [28.55549828393871]
As large language models (LLMs) scale in size and adoption, their computational and environmental costs continue to rise.
We systematically analyze the energy implications of common inference efficiency optimizations across diverse NLP and AI workloads.
Our findings reveal that the proper application of relevant inference efficiency optimizations can reduce total energy use by up to 73% from unoptimized baselines.
arXiv Detail & Related papers (2025-04-24T15:45:05Z)
- DeepSeek-Inspired Exploration of RL-based LLMs and Synergy with Wireless Networks: A Survey [62.697565282841026]
Reinforcement learning (RL)-based large language models (LLMs) have gained significant attention.
Wireless networks can be empowered by RL-based LLMs.
Wireless networks provide a vital infrastructure for the efficient training, deployment, and distributed inference of RL-based LLMs.
arXiv Detail & Related papers (2025-03-13T01:59:11Z)
- DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks.
We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge.
Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
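The routing mechanism named in the summary (sigmoid gating with a straight-through estimator over partitioned FFN blocks) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the layer sizes, the 0.5 gating threshold, and the class name are assumptions.

```python
# Hedged sketch of sigmoid gating with a straight-through estimator (STE)
# over partitioned FFN blocks, in the spirit of the DSMoE summary above.
import torch
import torch.nn as nn

class BlockGatedFFN(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_blocks=4):
        super().__init__()
        self.n_blocks = n_blocks
        block = d_ff // n_blocks
        # The dense FFN is split into independent computational blocks ("experts").
        self.up = nn.ModuleList([nn.Linear(d_model, block) for _ in range(n_blocks)])
        self.down = nn.ModuleList([nn.Linear(block, d_model) for _ in range(n_blocks)])
        self.router = nn.Linear(d_model, n_blocks)

    def forward(self, x):  # x: (batch, seq, d_model)
        probs = torch.sigmoid(self.router(x))   # soft per-block scores in (0, 1)
        hard = (probs > 0.5).float()             # discrete on/off routing decision
        gate = hard + probs - probs.detach()     # STE: forward uses `hard`, gradients flow through `probs`
        out = torch.zeros_like(x)
        for i in range(self.n_blocks):
            out = out + gate[..., i:i + 1] * self.down[i](torch.relu(self.up[i](x)))
        return out
```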
arXiv Detail & Related papers (2025-02-18T02:37:26Z)
- Energy-Efficient Flying LoRa Gateways: A Multi-Agent Reinforcement Learning Approach [8.65375886947338]
We deploy flying LoRa gateways mounted on unmanned aerial vehicles (UAVs) to collect data from LoRa end devices and transmit it to a central server.
Our primary objective is to maximize the global system energy efficiency of wireless LoRa networks by joint optimization of transmission power, spreading factor, bandwidth, and user association.
arXiv Detail & Related papers (2025-02-05T17:16:40Z)
- Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications.
The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard.
We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
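For reference, pointwise V-information is commonly defined (Ethayarajh et al., 2022) as the gain in log-probability of the gold output when the input is shown versus withheld; the exact variant used in this paper may differ. A minimal sketch:

```python
# Hedged sketch of pointwise V-information (PVI) as commonly defined;
# the paper's exact formulation may differ.
import math

def pvi(logprob_with_input: float, logprob_null_input: float) -> float:
    """PVI(x -> y) = -log2 p_g'(y | null) + log2 p_g(y | x).

    logprob_with_input: natural-log probability of the gold answer y given the
                        input x, from a model fine-tuned on (x, y) pairs.
    logprob_null_input: natural-log probability of y given an empty input, from
                        a model fine-tuned on (null, y) pairs.
    High-PVI examples carry more usable information and can be prioritized
    when selecting fine-tuning data.
    """
    return (logprob_with_input - logprob_null_input) / math.log(2)

# Example: y becomes much more predictable once x is seen -> high PVI (in bits).
print(pvi(logprob_with_input=-0.2, logprob_null_input=-3.0))  # ~4.04
```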
arXiv Detail & Related papers (2025-01-16T16:19:53Z)
- Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings [1.5749416770494706]
Large language models (LLMs) have shown significant improvements in many natural language processing (NLP) tasks.
LLMs are resource-intensive, requiring extensive computational resources both during training and inference.
As their adoption accelerates, the sustainability of LLMs has become a critical issue.
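The DVFS angle in the title can be illustrated by sweeping locked GPU clock caps while reading NVML's energy counter. The sketch below is an assumption-laden illustration, not the paper's measurement harness; it assumes a recent NVIDIA GPU, administrator rights, and a user-supplied `run_inference` callable, and the clock values are arbitrary.

```python
# Hedged sketch of a DVFS-style sweep: lock the GPU core clock to different caps
# and record energy per run via NVML's total-energy counter.
import pynvml

def energy_at_clock(max_clock_mhz: int, run_inference) -> float:
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    pynvml.nvmlDeviceSetGpuLockedClocks(handle, 210, max_clock_mhz)  # needs root, Volta+
    start_mj = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)    # millijoules
    run_inference()
    used_j = (pynvml.nvmlDeviceGetTotalEnergyConsumption(handle) - start_mj) / 1000
    pynvml.nvmlDeviceResetGpuLockedClocks(handle)
    pynvml.nvmlShutdown()
    return used_j

# Hypothetical usage, sweeping a few frequency caps (MHz):
# for cap in (900, 1200, 1500):
#     print(cap, energy_at_clock(cap, lambda: model.generate(**inputs)))
```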
arXiv Detail & Related papers (2025-01-14T16:02:33Z)
- The Price of Prompting: Profiling Energy Use in Large Language Models Inference [5.254805405012678]
This paper introduces MELODI, a framework crafted to monitor and analyze the energy consumed during large language model inference.
The dataset, generated using MELODI, encompasses a broad spectrum of LLM deployment frameworks, multiple language models, and extensive prompt datasets.
Our findings indicate substantial disparities in energy efficiency, suggesting ample scope for optimization and adoption of sustainable measures.
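MELODI's implementation is not reproduced here, but per-prompt energy profiling of the kind it describes can be approximated by sampling GPU power via NVML around an inference call. In this sketch the sampling interval, the simple rectangle-rule integration, and the `run_inference` callable are assumptions.

```python
# Hedged sketch (not MELODI): sample GPU power while a prompt is processed,
# then integrate the samples to estimate energy per response.
import time
import threading
import pynvml

def measure_energy(run_inference, interval_s=0.05):
    """Run `run_inference()` while sampling GPU power; return (result, joules)."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    samples, stop = [], threading.Event()

    def sampler():
        while not stop.is_set():
            samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)  # mW -> W
            time.sleep(interval_s)

    t = threading.Thread(target=sampler)
    t.start()
    result = run_inference()
    stop.set()
    t.join()
    pynvml.nvmlShutdown()
    joules = sum(samples) * interval_s  # rectangle-rule integration of power over time
    return result, joules

# Hypothetical usage: _, energy_j = measure_energy(lambda: model.generate(**inputs))
```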
arXiv Detail & Related papers (2024-07-04T12:16:28Z)
- Hybrid Heterogeneous Clusters Can Lower the Energy Consumption of LLM Inference Workloads [0.2389598109913753]
Training and using Large Language Models (LLMs) require large amounts of energy.
This paper addresses the challenge of reducing energy consumption in data centers running LLMs.
We propose a hybrid data center model that uses a cost-based scheduling framework to dynamically allocate tasks across hardware accelerators.
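The cost-based scheduling idea can be sketched as a greedy assignment that sends each request to the accelerator with the lowest estimated energy cost plus a load penalty. The cost model, device names, and numbers below are invented for illustration and are not the paper's framework.

```python
# Illustrative sketch only: greedy, cost-based assignment of inference requests
# to heterogeneous accelerators.
from dataclasses import dataclass

@dataclass
class Accelerator:
    name: str
    joules_per_1k_tokens: float   # assumed average energy cost on this hardware
    queue_tokens: int = 0         # tokens already queued (simple load proxy)

def assign(request_tokens: int, pool: list[Accelerator]) -> Accelerator:
    # Prefer the device minimizing estimated energy plus a small load penalty,
    # so cheap devices are favored without being overloaded.
    def cost(a: Accelerator) -> float:
        energy = a.joules_per_1k_tokens * request_tokens / 1000
        return energy + 0.001 * a.queue_tokens
    best = min(pool, key=cost)
    best.queue_tokens += request_tokens
    return best

pool = [Accelerator("gpu-large", 90.0), Accelerator("gpu-small", 40.0),
        Accelerator("cpu-node", 25.0)]
for tokens in (800, 1200, 300, 5000):
    print(tokens, "->", assign(tokens, pool).name)
```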
arXiv Detail & Related papers (2024-04-25T11:24:08Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
- Data-driven Energy Efficiency Modelling in Large-scale Networks: An Expert Knowledge and ML-based Approach [8.326834499339107]
This paper introduces the simulated reality of communication networks (SRCON) framework.
It harnesses live network data and employs a blend of machine learning (ML)- and expert-based models.
Results show significant gains over a state-of-the-art method used by an operator for network energy efficiency modeling.
arXiv Detail & Related papers (2023-12-31T10:03:08Z)
- Multiagent Reinforcement Learning with an Attention Mechanism for Improving Energy Efficiency in LoRa Networks [52.96907334080273]
As the network scale increases, the energy efficiency of LoRa networks decreases sharply due to severe packet collisions.
We propose a transmission parameter allocation algorithm based on multiagent reinforcement learning (MALoRa).
Simulation results demonstrate that MALoRa significantly improves system energy efficiency (EE) compared with baseline algorithms.
arXiv Detail & Related papers (2023-09-16T11:37:23Z)
- Federated Learning over Wireless IoT Networks with Optimized Communication and Resources [98.18365881575805]
Federated learning (FL), as a paradigm of collaborative learning, has attracted increasing research attention.
It is of interest to investigate fast-responding and accurate FL schemes over wireless systems.
We show that the proposed communication-efficient federated learning framework converges at a strong linear rate.
arXiv Detail & Related papers (2021-10-22T13:25:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.