A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems
- URL: http://arxiv.org/abs/2406.17335v3
- Date: Sat, 18 Jan 2025 02:04:15 GMT
- Title: A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems
- Authors: Hung Vinh Tran, Tong Chen, Quoc Viet Hung Nguyen, Zi Huang, Lizhen Cui, Hongzhi Yin
- Abstract summary: State-of-the-art recommender systems (RSs) depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables.
Despite the prosperity of lightweight embedding-based RSs (LERSs), a wide diversity is seen in evaluation protocols.
This study investigates various LERSs' performance, efficiency, and cross-task transferability via a thorough benchmarking process.
- Score: 67.52782366565658
- License:
- Abstract: Since the creation of the Web, recommender systems (RSs) have been an indispensable mechanism in information filtering. State-of-the-art RSs primarily depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables. To prevent over-parameterized embedding tables from harming scalability, both academia and industry have seen increasing efforts in compressing RS embeddings. However, despite the prosperity of lightweight embedding-based RSs (LERSs), a wide diversity is seen in evaluation protocols, resulting in obstacles when relating LERS performance to real-world usability. Moreover, despite the common goal of lightweight embeddings, LERSs are evaluated with a single choice between the two main recommendation tasks -- collaborative filtering and content-based recommendation. This lack of discussions on cross-task transferability hinders the development of unified, more scalable solutions. Motivated by these issues, this study investigates various LERSs' performance, efficiency, and cross-task transferability via a thorough benchmarking process. Additionally, we propose an efficient embedding compression method using magnitude pruning, which is an easy-to-deploy yet highly competitive baseline that outperforms various complex LERSs. Our study reveals the distinct performance of LERSs across the two tasks, shedding light on their effectiveness and generalizability. To support edge-based recommendations, we tested all LERSs on a Raspberry Pi 4, where the efficiency bottleneck is exposed. Finally, we conclude this paper with critical summaries of LERS performance, model selection suggestions, and underexplored challenges around LERSs for future research. To encourage future research, we publish source code and artifacts at https://github.com/chenxing1999/recsys-benchmark.
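The abstract mentions magnitude pruning as a simple embedding compression baseline. The sketch below illustrates the general idea only and is not the authors' implementation: it zeroes out the smallest-magnitude entries of a toy embedding table and keeps the survivors in a sparse form. The function names `magnitude_prune` and `to_sparse`, the global (table-wide) ranking, and the toy values are all assumptions made for illustration.

```python
def magnitude_prune(table, sparsity):
    """Return a copy of `table` with the smallest-magnitude entries set to 0.

    `sparsity` is the fraction of entries to remove, ranked globally
    across the whole table (an assumption; per-row ranking is also possible).
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    # Collect (magnitude, row, col) for every entry in the table.
    entries = [
        (abs(value), r, c)
        for r, row in enumerate(table)
        for c, value in enumerate(row)
    ]
    entries.sort(key=lambda e: e[0])
    n_prune = int(len(entries) * sparsity)
    pruned = [list(row) for row in table]  # copy, so the input is not mutated
    for _, r, c in entries[:n_prune]:
        pruned[r][c] = 0.0
    return pruned


def to_sparse(table):
    """Store only the nonzero entries as {row: {col: value}}."""
    return {
        r: {c: v for c, v in enumerate(row) if v != 0.0}
        for r, row in enumerate(table)
    }


# Hypothetical 3 x 4 embedding table (3 feature values, dimension 4).
embeddings = [
    [0.90, -0.05, 0.30, 0.01],
    [-0.02, 0.70, -0.60, 0.04],
    [0.10, -0.80, 0.03, 0.50],
]
pruned = magnitude_prune(embeddings, sparsity=0.5)
sparse = to_sparse(pruned)
print(pruned)
print(sparse)
```

In practice the memory saving comes from the sparse storage rather than from the zeros themselves, and the sparsity level would be tuned against recommendation accuracy.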
Related papers
- RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization [53.63439735067081]
Large language models (LLMs) have achieved impressive performance but face high computational costs and latency.
Retrieval-augmented generation (RAG) helps by integrating external knowledge, but imperfect retrieval can introduce distracting noise that misleads SLMs.
We propose RoseRAG, a robust RAG framework for SLMs via Margin-aware Preference Optimization.
arXiv Detail & Related papers (2025-02-16T04:56:53Z) - Efficient and Responsible Adaptation of Large Language Models for Robust and Equitable Top-k Recommendations [12.814937243361054]
We propose a hybrid task allocation framework designed to promote social good by equitably serving all user groups.
Our results on three real-world datasets show a significant reduction in weak users and improved robustness to subpopulations without disproportionately escalating costs.
arXiv Detail & Related papers (2025-01-08T18:08:48Z) - A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 unique instruction-following prompts.
With our synthetic prompts, we use two preference dataset curation methods: rejection sampling (RS) and Monte Carlo Tree Search (MCTS).
Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements.
High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
arXiv Detail & Related papers (2024-12-18T15:38:39Z) - CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [70.25689961697523]
We propose a generalizable algorithm that enhances sequential reasoning by cross-task experience sharing and selection.
Our work bridges the gap between existing sequential reasoning paradigms and validates the effectiveness of leveraging cross-task experiences.
arXiv Detail & Related papers (2024-10-22T03:59:53Z) - PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead [24.611413814466978]
Large language models (LLMs) enhanced with retrieval-augmented generation (RAG) have introduced a new paradigm for web search.
Existing methods to enhance context awareness are often inefficient, incurring time or memory overhead during inference.
We propose Position-Embedding-Agnostic attention Re-weighting (PEAR) which enhances the context awareness of LLMs with zero inference overhead.
arXiv Detail & Related papers (2024-09-29T15:40:54Z) - Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models [53.547190001324665]
We propose REKI to acquire two types of external knowledge about users and items from large language models (LLMs).
We develop individual knowledge extraction and collective knowledge extraction tailored for different scales of scenarios, effectively reducing offline resource consumption.
Experiments demonstrate that REKI outperforms state-of-the-art baselines and is compatible with lots of recommendation algorithms and tasks.
arXiv Detail & Related papers (2024-08-20T03:45:24Z) - Efficient and Responsible Adaptation of Large Language Models for Robust Top-k Recommendations [11.004673022505566]
Long user queries from millions of users can degrade the performance of large language models for recommendation.
We propose a hybrid task allocation framework that utilizes the capabilities of both large language models and traditional recommendation systems.
Our results on three real-world datasets show a significant reduction in weak users and improved robustness of RSs to sub-populations.
arXiv Detail & Related papers (2024-05-01T19:11:47Z) - Query Encoder Distillation via Embedding Alignment is a Strong Baseline Method to Boost Dense Retriever Online Efficiency [4.254906060165999]
We show that even a 2-layer, BERT-based query encoder can still retain 92.5% of the full DE performance on the BEIR benchmark.
We hope that our findings will encourage the community to re-evaluate the trade-offs between method complexity and performance improvements.
arXiv Detail & Related papers (2023-06-05T06:53:55Z) - Choosing the Best of Both Worlds: Diverse and Novel Recommendations through Multi-Objective Reinforcement Learning [68.45370492516531]
We introduce Scalarized Multi-Objective Reinforcement Learning (SMORL) for the Recommender Systems (RS) setting.
The SMORL agent augments standard recommendation models with additional RL layers that force it to simultaneously satisfy three principal objectives: accuracy, diversity, and novelty of recommendations.
Our experimental results on two real-world datasets reveal a substantial increase in aggregate diversity, a moderate increase in accuracy, reduced repetitiveness of recommendations, and demonstrate the importance of reinforcing diversity and novelty as complementary objectives.
arXiv Detail & Related papers (2021-10-28T13:22:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.