EASRec: Elastic Architecture Search for Efficient Long-term Sequential
Recommender Systems
- URL: http://arxiv.org/abs/2402.00390v1
- Date: Thu, 1 Feb 2024 07:22:52 GMT
- Title: EASRec: Elastic Architecture Search for Efficient Long-term Sequential
Recommender Systems
- Authors: Sheng Zhang, Maolin Wang, Yao Zhao, Chenyi Zhuang, Jinjie Gu, Ruocheng
Guo, Xiangyu Zhao, Zijian Zhang, Hongzhi Yin
- Abstract summary: Current Sequential Recommender Systems (SRSs) suffer from computational and resource inefficiencies.
We develop the Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems (EASRec).
EASRec introduces data-aware gates that leverage historical information from input data batch to improve the performance of the recommendation network.
- Score: 82.76483989905961
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this age where data is abundant, the ability to distill meaningful
insights from the sea of information is essential. Our research addresses the
computational and resource inefficiencies that current Sequential Recommender
Systems (SRSs) suffer from, especially those employing attention-based models
like SASRec. These systems are designed for next-item recommendations in
various applications, from e-commerce to social networks. However, such systems
suffer from substantial computational costs and resource consumption during the
inference stage. To tackle these issues, our research proposes a novel method
that combines automatic pruning techniques with advanced model architectures.
We also explore the potential of resource-constrained Neural Architecture
Search (NAS), a technique prevalent in the realm of recommendation systems, to
fine-tune models for reduced FLOPs, latency, and energy usage while retaining
or even enhancing accuracy. The main contribution of our work is developing the
Elastic Architecture Search for Efficient Long-term Sequential Recommender
Systems (EASRec). This approach aims to find optimal compact architectures for
attention-based SRSs, ensuring accuracy retention. EASRec introduces data-aware
gates that leverage historical information from input data batch to improve the
performance of the recommendation network. Additionally, it utilizes a dynamic
resource constraint approach, which standardizes the search process and results
in more appropriate architectures. The effectiveness of our methodology is
validated through exhaustive experiments on three benchmark datasets, which
demonstrate EASRec's superiority in SRSs. Our research sets a new standard for
future exploration into efficient and accurate recommender systems, signifying
a substantial advancement within this swiftly advancing field.
Related papers
- Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models [26.353428245346166]
The Extract-Refine-Retrieve-Read (ERRR) framework is designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems.
Unlike conventional query optimization techniques used in RAG, the ERRR framework begins by extracting knowledge from Large Language Models (LLMs).
arXiv Detail & Related papers (2024-11-12T14:12:45Z)
- Exploring Applications of State Space Models and Advanced Training Techniques in Sequential Recommendations: A Comparative Study on Efficiency and Performance [41.677784966514686]
This research focuses on three promising directions in sequential recommendations.
The first is to enhance speed through the use of State Space Models (SSM), as they can achieve SOTA results in the sequential recommendations domain with lower latency, memory, and inference costs.
The second is to improve the quality of recommendations with Large Language Models (LLMs), via Monolithic Preference Optimization without Reference Model (ORPO), and implementing adaptive batch- and step-size algorithms to reduce costs and accelerate training processes.
arXiv Detail & Related papers (2024-08-10T18:09:10Z)
- A Study on the Implementation Method of an Agent-Based Advanced RAG System Using Graph [0.0]
This study implements an advanced RAG system based on Graph technology to develop high-quality generative AI services.
It employs LangGraph to evaluate the reliability of retrieved information and synthesizes diverse data to generate more accurate and enhanced responses.
arXiv Detail & Related papers (2024-07-29T13:26:43Z)
- A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems [67.52782366565658]
State-of-the-art recommender systems (RSs) depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables.
Despite the prosperity of lightweight embedding-based RSs, a wide diversity is seen in evaluation protocols.
This study investigates various LERS' performance, efficiency, and cross-task transferability via a thorough benchmarking process.
arXiv Detail & Related papers (2024-06-25T07:45:00Z)
- Efficient Architecture Search via Bi-level Data Pruning [70.29970746807882]
This work pioneers an exploration into the critical role of dataset characteristics for DARTS bi-level optimization.
We introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric.
Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50%.
arXiv Detail & Related papers (2023-12-21T02:48:44Z)
- Embedding in Recommender Systems: A Survey [67.67966158305603]
A crucial aspect is embedding techniques that convert high-dimensional discrete features, such as user and item IDs, into low-dimensional continuous vectors.
Applying embedding techniques captures complex entity relationships and has spurred substantial research.
This survey covers embedding methods like collaborative filtering, self-supervised learning, and graph-based techniques.
arXiv Detail & Related papers (2023-10-28T06:31:06Z)
- Modeling Time-Series and Spatial Data for Recommendations and Other Applications [1.713291434132985]
We address the problems that may arise due to the poor quality of CTES data being fed into a recommender system.
To improve the quality of the CTES data, we address a fundamental problem of overcoming missing events in temporal sequences.
We extend their abilities to design solutions for large-scale CTES retrieval and human activity prediction.
arXiv Detail & Related papers (2022-12-25T09:34:15Z)
- Learning Where To Look -- Generative NAS is Surprisingly Efficient [11.83842808044211]
We propose a generative model, paired with a surrogate predictor, that iteratively learns to generate samples from increasingly promising latent subspaces.
This approach leads to very effective and efficient architecture search, while keeping the query amount low.
arXiv Detail & Related papers (2022-03-16T16:27:11Z)
- LoRD-Net: Unfolded Deep Detection Network with Low-Resolution Receivers [104.01415343139901]
We propose a deep detector entitled LoRD-Net for recovering information symbols from one-bit measurements.
LoRD-Net has a task-based architecture dedicated to recovering the underlying signal of interest.
We evaluate the proposed receiver architecture for one-bit signal recovery in wireless communications.
arXiv Detail & Related papers (2021-02-05T04:26:05Z)
- Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search [50.40004966087121]
We introduce a new reinforcement learning based neural architecture search (NAS) methodology for generative adversarial network (GAN) architecture search.
The key idea is to formulate the GAN architecture search problem as a Markov decision process (MDP) for smoother architecture sampling.
We exploit an off-policy GAN architecture search algorithm that makes efficient use of the samples generated by previous policies.
arXiv Detail & Related papers (2020-07-17T18:29:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.