Green Recommender Systems: Optimizing Dataset Size for Energy-Efficient Algorithm Performance
- URL: http://arxiv.org/abs/2410.09359v2
- Date: Tue, 5 Nov 2024 03:45:24 GMT
- Title: Green Recommender Systems: Optimizing Dataset Size for Energy-Efficient Algorithm Performance
- Authors: Ardalan Arabzadeh, Tobias Vente, Joeran Beel
- Abstract summary: This paper investigates the potential for energy-efficient algorithm performance by optimizing dataset sizes.
We conducted experiments on the MovieLens 100K, 1M, 10M, and Amazon Toys and Games datasets.
- Score: 0.10241134756773229
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As recommender systems become increasingly prevalent, the environmental impact and energy efficiency of training large-scale models have come under scrutiny. This paper investigates the potential for energy-efficient algorithm performance by optimizing dataset sizes through downsampling techniques in the context of Green Recommender Systems. We conducted experiments on the MovieLens 100K, 1M, 10M, and Amazon Toys and Games datasets, analyzing the performance of various recommender algorithms when trained on different portions of the data. Our results indicate that while more training data generally leads to higher algorithm performance, certain algorithms, such as FunkSVD and BiasedMF, maintain high-quality recommendations with up to a 50% reduction in training data, particularly on unbalanced and sparse datasets like Amazon Toys and Games, achieving nDCG@10 scores within approximately 13% of full-dataset performance. These findings suggest that strategic dataset reduction can decrease computational and environmental costs without substantially compromising recommendation quality. This study advances sustainable and green recommender systems by providing insights for reducing energy consumption while maintaining effectiveness.
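The abstract ships no code, but the experimental loop it describes is easy to outline. Below is a minimal, self-contained Python sketch of that loop: downsample each user's training interactions to a given fraction, train a recommender, and record mean nDCG@10. The toy data, the popularity baseline (standing in for FunkSVD / BiasedMF), and all hyperparameters are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Toy interaction log standing in for MovieLens / Amazon Toys and Games.
train = pd.DataFrame({
    "user": rng.integers(0, 50, 2000),
    "item": rng.integers(0, 200, 2000),
})
# Five hypothetical held-out relevant items per user.
test_items = {u: set(rng.integers(0, 200, 5).tolist()) for u in range(50)}

def downsample(df, fraction, seed=42):
    """Keep a random `fraction` of each user's training interactions."""
    return df.groupby("user").sample(frac=fraction, random_state=seed)

def ndcg_at_10(ranked, relevant):
    """Binary-relevance nDCG@10 for one user."""
    dcg = sum(1.0 / np.log2(r + 2)
              for r, item in enumerate(ranked[:10]) if item in relevant)
    idcg = sum(1.0 / np.log2(r + 2) for r in range(min(len(relevant), 10)))
    return dcg / idcg if idcg else 0.0

# Sweep the training-set portions, as in the paper's experiments.
for fraction in (1.0, 0.9, 0.7, 0.5, 0.3):
    reduced = downsample(train, fraction)
    # A popularity baseline stands in for the paper's recommender algorithms.
    top10 = reduced["item"].value_counts().index[:10].tolist()
    score = np.mean([ndcg_at_10(top10, test_items[u]) for u in test_items])
    print(f"{fraction:>4.0%} of training data -> mean nDCG@10 = {score:.4f}")
```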
Related papers
- Reward-Augmented Data Enhances Direct Preference Alignment of LLMs [56.24431208419858]
We introduce reward-conditioned Large Language Models (LLMs) that learn from the entire spectrum of response quality within the dataset.
We propose an effective yet simple data relabeling method that conditions the preference pairs on quality scores to construct a reward-augmented dataset.
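A minimal sketch of this relabeling idea: instead of keeping only (chosen, rejected) pairs, condition each response on its own quality score so the model learns from the full quality spectrum. The record fields and the score-to-prefix mapping below are assumptions for illustration, not the paper's exact format.

```python
def reward_augment(pref_pairs):
    """pref_pairs: iterable of dicts with prompt, chosen/rejected responses, and scores."""
    augmented = []
    for ex in pref_pairs:
        for response, score in ((ex["chosen"], ex["score_chosen"]),
                                (ex["rejected"], ex["score_rejected"])):
            augmented.append({
                # Prepend the quality score as a conditioning prefix.
                "input": f"<reward={score}> {ex['prompt']}",
                "target": response,
            })
    return augmented

pairs = [{"prompt": "Summarise the abstract.", "chosen": "A concise summary...",
          "rejected": "Unrelated text.", "score_chosen": 9, "score_rejected": 2}]
print(reward_augment(pairs))  # two conditioned examples from one preference pair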
arXiv Detail & Related papers (2024-10-10T16:01:51Z)
- Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, analyzing their impact on its performance, and investigate open-source BPR implementations.
Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations.
We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
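For reference, the BPR objective that the study replicates is compact enough to sketch: for a sampled (user, positive item, negative item) triple, matrix-factorization BPR maximizes log sigma(x_ui - x_uj) with SGD. In the sketch below the triples are random placeholders rather than real implicit feedback, and the hyperparameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 100, 500, 16
P = 0.1 * rng.normal(size=(n_users, k))      # user factors
Q = 0.1 * rng.normal(size=(n_items, k))      # item factors
lr, reg = 0.05, 0.01

for _ in range(10_000):
    u = rng.integers(n_users)
    i, j = rng.integers(n_items, size=2)     # i plays the positive, j the negative
    pu, qi, qj = P[u].copy(), Q[i].copy(), Q[j].copy()
    g = 1.0 / (1.0 + np.exp(pu @ (qi - qj))) # sigma(-x_uij), gradient scale of -log sigma(x_uij)
    P[u] += lr * (g * (qi - qj) - reg * pu)  # ascend the log-likelihood, shrink by L2
    Q[i] += lr * (g * pu - reg * qi)
    Q[j] += lr * (-g * pu - reg * qj)
```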
arXiv Detail & Related papers (2024-09-21T18:39:53Z)
- An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning [0.15833270109954137]
We present up to eight different methods to reduce the size of a training dataset.
We also develop a Python package to apply them.
We experimentally compare how these data reduction methods affect the representativeness of the reduced dataset.
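The eight methods themselves live in the paper and its package; as a point of reference, the simplest member of this family is plain stratified random sampling, sketched below under the assumption of a labelled NumPy dataset (all names illustrative).

```python
import numpy as np

def stratified_reduce(X, y, fraction, seed=0):
    """Keep a fixed fraction of each class, preserving the label distribution."""
    rng = np.random.default_rng(seed)
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        n = max(1, int(len(idx) * fraction))
        keep.extend(rng.choice(idx, size=n, replace=False))
    keep = np.sort(keep)
    return X[keep], y[keep]

X = np.random.default_rng(1).normal(size=(1000, 8))
y = np.random.default_rng(2).integers(0, 3, 1000)
X_small, y_small = stratified_reduce(X, y, fraction=0.25)
print(X_small.shape, np.bincount(y_small))   # ~250 samples, label ratios kept
```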
arXiv Detail & Related papers (2024-03-22T12:06:40Z)
- EASRec: Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems [82.76483989905961]
Current Sequential Recommender Systems (SRSs) suffer from computational and resource inefficiencies.
We develop the Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems (EASRec).
EASRec introduces data-aware gates that leverage historical information from the input data batch to improve the performance of the recommendation network.
arXiv Detail & Related papers (2024-02-01T07:22:52Z)
- Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [60.17407932691429]
Open Radio Access Network (O-RAN) systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability.
We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments.
We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z)
- Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization [14.23697277904244]
We present Reweighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample re-weighting.
We demonstrate the effectiveness of RGD on various learning tasks, including supervised learning, meta-learning, and out-of-domain generalization.
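The "dynamic sample re-weighting" in this summary can be made concrete with a small sketch: under a KL-regularized distributionally robust objective, the optimal sample weights take the form w_i proportional to exp(l_i / tau), so harder examples contribute larger gradients. The logistic-regression setup and the temperature tau below are assumptions for illustration, not the paper's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(float)
w = np.zeros(5)
tau, lr = 1.0, 0.1

for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-X @ w))                 # sigmoid predictions
    losses = -(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    weights = np.exp(losses / tau)                   # up-weight high-loss samples
    weights /= weights.sum()                         # re-weighted data distribution
    grad = X.T @ ((p - y) * weights)                 # weighted logistic gradient
    w -= lr * grad

p = 1.0 / (1.0 + np.exp(-X @ w))
print("train accuracy:", ((p > 0.5) == y).mean())
```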
arXiv Detail & Related papers (2023-06-15T15:58:04Z)
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
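A toy rendition of the alignment idea: learn a small synthetic set whose feature statistics match those of the real data. The fixed random projection below stands in for the network features the actual method aligns across multiple scales; everything here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(size=(1000, 32))        # real dataset (1000 samples, 32 dims)
synthetic = rng.normal(size=(20, 32))     # condensed set to be learned
W = rng.normal(size=(32, 16))             # fixed projection, a stand-in "layer"

target = np.tanh(real @ W).mean(axis=0)   # real feature statistics to match
lr = 0.5
for _ in range(500):
    h = np.tanh(synthetic @ W)            # synthetic features
    diff = h.mean(axis=0) - target
    # Gradient of ||mean(h) - target||^2 with respect to the synthetic samples.
    grad = 2 * ((1 - h**2) * diff) @ W.T / len(synthetic)
    synthetic -= lr * grad

print("alignment error:", np.linalg.norm(np.tanh(synthetic @ W).mean(0) - target))
```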
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
- Balancing Performance and Energy Consumption of Bagging Ensembles for the Classification of Data Streams in Edge Computing [9.801387036837871]
Edge Computing (EC) has emerged as an enabling factor for developing technologies like the Internet of Things (IoT) and 5G networks.
This work investigates strategies for optimizing the performance and energy consumption of bagging ensembles to classify data streams.
arXiv Detail & Related papers (2022-01-17T04:12:18Z)
- Fine-Grained Data Selection for Improved Energy Efficiency of Federated Edge Learning [2.924868086534434]
In federated edge learning (FEEL), energy-constrained devices at the network edge consume significant energy when training and uploading their local machine learning models.
This work proposes novel solutions for energy-efficient FEEL by jointly considering local training data, available computation, and communications resources.
arXiv Detail & Related papers (2021-06-20T10:51:32Z)
- SASL: Saliency-Adaptive Sparsity Learning for Neural Network Acceleration [20.92912642901645]
We propose a Saliency-Adaptive Sparsity Learning (SASL) approach for further optimization.
Our method can reduce the FLOPs of ResNet-50 by 49.7% with a negligible 0.39% top-1 and 0.05% top-5 accuracy degradation.
arXiv Detail & Related papers (2020-03-12T16:49:37Z)
- Adversarial Filters of Dataset Biases [96.090959788952]
Large neural models have demonstrated human-level performance on language and vision benchmarks.
Their performance degrades considerably on adversarial or out-of-distribution samples.
We propose AFLite, which adversarially filters such dataset biases.
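The filtering loop can be sketched compactly: repeatedly fit cheap linear probes on random splits, score each example by how often probes predict its label from the features alone, and drop the most predictable examples, which carry the dataset's spurious shortcuts. The least-squares probe, the 0.75 threshold, and the synthetic "shortcut" feature below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] > 0).astype(float)            # feature 0 acts as the dataset "bias"

hits, counts = np.zeros(len(X)), np.zeros(len(X))
for _ in range(64):
    train = rng.choice(len(X), size=250, replace=False)
    test = np.setdiff1d(np.arange(len(X)), train)
    # Least-squares linear probe standing in for AFLite's linear classifiers.
    w, *_ = np.linalg.lstsq(X[train], 2 * y[train] - 1, rcond=None)
    hits[test] += ((X[test] @ w > 0) == y[test])
    counts[test] += 1

predictability = hits / np.maximum(counts, 1)
keep = predictability < 0.75               # drop the easiest (most biased) examples
print("kept", keep.sum(), "of", len(X))
```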
arXiv Detail & Related papers (2020-02-10T21:59:21Z)