Pathway-based Progressive Inference (PaPI) for Energy-Efficient Continual Learning
- URL: http://arxiv.org/abs/2506.17848v1
- Date: Sat, 21 Jun 2025 22:50:01 GMT
- Title: Pathway-based Progressive Inference (PaPI) for Energy-Efficient Continual Learning
- Authors: Suyash Gaurav, Jukka Heikkonen, Jatin Chaudhary
- Abstract summary: Continual learning systems face the dual challenge of preventing catastrophic forgetting while maintaining energy efficiency. This paper introduces Pathway-based Progressive Inference (PaPI), a novel theoretical framework that addresses these challenges. Our theoretical analysis shows that PaPI achieves an $\mathcal{O}(K)$ improvement in the stability-plasticity trade-off.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continual learning systems face the dual challenge of preventing catastrophic forgetting while maintaining energy efficiency, particularly in resource-constrained environments. This paper introduces Pathway-based Progressive Inference (PaPI), a novel theoretical framework that addresses these challenges through a mathematically rigorous approach to pathway selection and adaptation. We formulate continual learning as an energy-constrained optimization problem and provide formal convergence guarantees for our pathway routing mechanisms. Our theoretical analysis demonstrates that PaPI achieves an $\mathcal{O}(K)$ improvement in the stability-plasticity trade-off compared to monolithic architectures, where $K$ is the number of pathways. We derive tight bounds on forgetting rates using Fisher Information Matrix analysis and prove that PaPI's energy consumption scales with the number of active parameters rather than the total model size. Comparative theoretical analysis shows that PaPI provides stronger guarantees against catastrophic forgetting than Elastic Weight Consolidation (EWC) while maintaining better energy efficiency than both EWC and Gradient Episodic Memory (GEM). Our experimental validation confirms these theoretical advantages across multiple benchmarks, demonstrating PaPI's effectiveness for continual learning in energy-constrained settings. Our codes are available at https://github.com/zser092/PAPI_FILES.
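The repository linked above is the authoritative implementation. Purely as a hedged illustration of the core routing idea — a shared backbone with $K$ task-specific pathways, where a forward pass activates exactly one pathway so that compute (and hence energy) scales with the active parameters rather than the full model — here is a minimal PyTorch sketch. All class, variable, and dimension names are hypothetical and are not taken from the PaPI codebase.

```python
# Hypothetical sketch of task-conditioned pathway routing (not the official PaPI code).
# Only the selected pathway's parameters are exercised per input, so per-forward-pass
# cost scales with active parameters rather than total model size.
import torch
import torch.nn as nn

class PathwayNet(nn.Module):
    def __init__(self, in_dim: int, hidden: int, num_classes: int, num_pathways: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())  # shared trunk
        self.pathways = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, num_classes))
            for _ in range(num_pathways)
        ])

    def forward(self, x: torch.Tensor, pathway_id: int) -> torch.Tensor:
        # Route through exactly one pathway; the other K-1 pathways stay inactive.
        return self.pathways[pathway_id](self.backbone(x))

    def active_parameters(self, pathway_id: int) -> int:
        # Simple energy proxy: parameters touched by a forward pass on this pathway.
        return sum(p.numel() for p in self.backbone.parameters()) + \
               sum(p.numel() for p in self.pathways[pathway_id].parameters())

model = PathwayNet(in_dim=32, hidden=64, num_classes=10, num_pathways=4)
x = torch.randn(8, 32)
logits = model(x, pathway_id=2)            # task 2 routed through pathway 2 only
print(logits.shape, model.active_parameters(2))
```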
Related papers
- Tu(r)ning AI Green: Exploring Energy Efficiency Cascading with Orthogonal Optimizations [2.829284162137884]
This paper emphasizes treating energy efficiency as a first-class citizen and a fundamental design consideration for compute-intensive pipelines. We show that strategic selection across five AI pipeline phases (data, model, training, system, inference) creates cascading efficiency gains. Such combinations reduce energy consumption by up to 94.6% while preserving 95.95% of the original F1 score of non-optimized pipelines.
arXiv Detail & Related papers (2025-06-23T04:52:08Z) - Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections [65.36449542323277]
We present a unified theoretical framework bridging Supervised Fine-Tuning (SFT) and preference learning in Large Language Model (LLM) post-training. We propose a simple yet effective learning-rate reduction approach that yields significant performance improvements.
arXiv Detail & Related papers (2025-06-15T05:42:29Z) - Energy Considerations of Large Language Model Inference and Efficiency Optimizations [28.55549828393871]
As large language models (LLMs) scale in size and adoption, their computational and environmental costs continue to rise. We systematically analyze the energy implications of common inference efficiency optimizations across diverse NLP and AI workloads. Our findings reveal that the proper application of relevant inference efficiency optimizations can reduce total energy use by up to 73% from unoptimized baselines.
arXiv Detail & Related papers (2025-04-24T15:45:05Z) - EdgeRL: Reinforcement Learning-driven Deep Learning Model Inference Optimization at Edge [2.8946323553477704]
We propose the EdgeRL framework, which seeks to strike a balance by using an Advantage Actor-Critic (A2C) Reinforcement Learning (RL) approach.
We evaluate the benefits of the EdgeRL framework in terms of end-device energy savings, inference accuracy improvement, and end-to-end inference latency reduction.
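EdgeRL's exact state, action, and reward design is not described in the blurb above. The following toy PyTorch sketch only illustrates the general pattern of an advantage actor-critic policy choosing among discrete inference configurations to trade off energy, accuracy, and latency; the configuration names and reward numbers are invented for illustration.

```python
# Toy actor-critic (A2C-style) selector over discrete inference configurations.
# Rewards below are fabricated for illustration; EdgeRL's real state, action, and
# reward definitions are in the paper and are not reproduced here.
import torch
import torch.nn as nn

configs = ["full_model", "quantized", "early_exit"]          # hypothetical action set
reward_table = torch.tensor([0.2, 0.6, 0.5])                 # made-up accuracy/energy/latency trade-off

actor = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, len(configs)))
critic = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-2)

state = torch.ones(1, 4)                                     # dummy device/network state
for step in range(200):
    dist = torch.distributions.Categorical(logits=actor(state))
    action = dist.sample()
    reward = reward_table[action] + 0.05 * torch.randn(())   # noisy observed reward
    value = critic(state).squeeze()
    advantage = (reward - value).detach()
    # policy-gradient term weighted by the advantage, plus a value-regression term
    loss = (-dist.log_prob(action) * advantage + (reward - value).pow(2)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

print("preferred config:", configs[actor(state).argmax().item()])
```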
arXiv Detail & Related papers (2024-10-16T04:31:39Z) - Federated Learning With Energy Harvesting Devices: An MDP Framework [17.87295300394514]
Federated learning (FL) requires that edge devices conduct local training and communicate with a parameter server, resulting in significant energy consumption. A key challenge in practical FL systems is the rapid battery depletion of edge devices, which limits their operational lifespan and impacts learning performance. We implement energy harvesting techniques in FL systems to capture ambient energy, thereby providing continuous power to edge devices.
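As a hedged illustration of the MDP framing (not the paper's actual model), one can treat the client's battery level as the state and "train vs. idle" as the action, with stochastic energy harvesting; the toy value-iteration sketch below uses made-up costs and harvest probabilities.

```python
# Toy MDP for an energy-harvesting FL client: states are battery levels 0..B,
# actions are "train" (costs energy, yields a learning reward) or "idle".
# All numbers are illustrative and not taken from the paper.
B = 10                 # max battery level
COST, REWARD = 3, 1.0  # energy cost of one local round, reward for participating
P_HARVEST = 0.6        # probability of harvesting one energy unit per slot
GAMMA = 0.95

def q_values(b, V):
    # Q(idle): battery may gain one harvested unit, no reward.
    idle = GAMMA * (P_HARVEST * V[min(b + 1, B)] + (1 - P_HARVEST) * V[b])
    # Q(train): feasible only with enough battery; pay COST, then maybe harvest.
    if b >= COST:
        nb = b - COST
        train = REWARD + GAMMA * (P_HARVEST * V[min(nb + 1, B)] + (1 - P_HARVEST) * V[nb])
    else:
        train = float("-inf")
    return idle, train

V = [0.0] * (B + 1)
for _ in range(500):                                   # value iteration
    V = [max(q_values(b, V)) for b in range(B + 1)]

policy = ["train" if q_values(b, V)[1] >= q_values(b, V)[0] else "idle" for b in range(B + 1)]
print(policy)  # typically: idle until enough energy is banked, then train
```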
arXiv Detail & Related papers (2024-05-17T03:41:40Z) - A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to balance the demands of high-performance machine learning models against environmental sustainability.
Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance.
However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z) - c-TPE: Tree-structured Parzen Estimator with Inequality Constraints for Expensive Hyperparameter Optimization [45.67326752241075]
We propose constrained TPE (c-TPE), an extension of the widely used, versatile Bayesian optimization method, the tree-structured Parzen estimator (TPE), to handle inequality constraints.
Our proposed extension goes beyond a simple combination of an existing acquisition function and the original TPE, and instead includes modifications that address issues that cause poor performance.
In the experiments, we demonstrate that c-TPE exhibits the best average rank performance among existing methods, with statistical significance, on 81 expensive HPO problems with inequality constraints.
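A hands-on way to experiment with constraint-aware TPE is Optuna's TPESampler, which accepts a constraints_func; note that Optuna's internal constraint handling is not necessarily the c-TPE algorithm proposed in this paper. A minimal sketch with a toy objective and one inequality constraint:

```python
# Constraint-aware TPE via Optuna's TPESampler (constraints_func). This uses Optuna's
# built-in constrained TPE, which may differ from the c-TPE algorithm in the paper.
# The objective and constraint below are toy examples.
import optuna

def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", -5.0, 5.0)
    y = trial.suggest_float("y", -5.0, 5.0)
    # Inequality constraint g(x, y) = 1 - x - y <= 0, stored so the sampler can read it.
    trial.set_user_attr("constraint", (1.0 - x - y,))
    return x ** 2 + y ** 2

def constraints(trial: optuna.trial.FrozenTrial):
    return trial.user_attrs["constraint"]

sampler = optuna.samplers.TPESampler(constraints_func=constraints, seed=0)
study = optuna.create_study(direction="minimize", sampler=sampler)
study.optimize(objective, n_trials=50)
print(study.best_trial.params)  # best_trial ranks by objective only; check feasibility separately
```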
arXiv Detail & Related papers (2022-11-26T00:25:11Z) - Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al.
We provide a theoretical analysis and an empirical case study of the conditions under which, and the extent to which, these meta-learning guarantees improve upon PAC-Bayesian per-task learning bounds.
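For context, the per-task PAC-Bayesian guarantee that such meta-learning bounds aim to improve upon is the classical McAllester/Maurer bound, stated here in its standard form for losses in $[0,1]$ (this is background, not the paper's meta-learning bound): with probability at least $1-\delta$ over an i.i.d. sample of size $n$, for every posterior $Q$ and any prior $P$ fixed before seeing the data,

$$
\mathcal{L}(Q) \;\le\; \widehat{\mathcal{L}}(Q) + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
$$

where $\widehat{\mathcal{L}}(Q)$ is the empirical risk and $\mathcal{L}(Q)$ the population risk of the Gibbs predictor drawn from $Q$.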
arXiv Detail & Related papers (2022-11-14T08:51:04Z) - Generalized Parametric Contrastive Learning [60.62901294843829]
Generalized Parametric Contrastive Learning (GPaCo/PaCo) works well on both imbalanced and balanced data.
Experiments on long-tailed benchmarks establish a new state of the art for long-tailed recognition.
arXiv Detail & Related papers (2022-09-26T03:49:28Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
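The paper's specific "difference reconstruction loss" is not spelled out in the blurb above; as a generic, hedged illustration of reconstruction-based feedback-weight training in target propagation (a DTP-style denoising loss, not the paper's exact objective), a short PyTorch sketch:

```python
# Generic denoising reconstruction loss for feedback weights in target propagation.
# This is the DTP-style loss; the paper's "difference reconstruction loss" differs
# in detail and is not reproduced here.
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(20, 20), nn.Tanh())   # forward mapping of one layer
g = nn.Sequential(nn.Linear(20, 20), nn.Tanh())   # feedback mapping (to be trained)
opt = torch.optim.Adam(g.parameters(), lr=1e-3)

h = torch.randn(64, 20)                           # activations entering the layer
for _ in range(100):
    noisy = h + 0.1 * torch.randn_like(h)         # corrupt activations
    with torch.no_grad():
        fwd = f(noisy)                            # forward weights are not trained here
    recon = g(fwd)                                # feedback should approximately invert f
    loss = ((recon - noisy) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```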
arXiv Detail & Related papers (2020-06-25T12:07:06Z) - Targeted free energy estimation via learned mappings [66.20146549150475]
Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences.
FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions.
One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap.
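For reference, the Zwanzig identity underlying FEP is

$$
\Delta F = F_B - F_A = -\beta^{-1} \ln \big\langle e^{-\beta\,[U_B(x) - U_A(x)]} \big\rangle_A, \qquad \beta = \frac{1}{k_B T},
$$

where the average is taken over configurations drawn from state A. The overlap requirement mentioned above arises because the exponential average is dominated by configurations that are probable under both $U_A$ and $U_B$.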
arXiv Detail & Related papers (2020-02-12T11:10:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.