Tail-Aware Data Augmentation for Long-Tail Sequential Recommendation
- URL: http://arxiv.org/abs/2601.10933v1
- Date: Fri, 16 Jan 2026 01:29:36 GMT
- Title: Tail-Aware Data Augmentation for Long-Tail Sequential Recommendation
- Authors: Yizhou Dang, Zhifu Wei, Minhan Huang, Lianbo Ma, Jianzhe Zhao, Guibing Guo, Xingwei Wang
- Abstract summary: Sequential recommendation (SR) learns user preferences based on their historical interaction sequences and provides personalized suggestions. We propose Tail-Aware Data Augmentation (TADA) for long-tail sequential recommendation.
- Score: 25.19179606094266
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sequential recommendation (SR) learns user preferences from historical interaction sequences and provides personalized suggestions. In real-world scenarios, most users interact with only a handful of items, while the majority of items are seldom consumed. This pervasive long-tail challenge limits the model's ability to learn user preferences. Despite previous efforts to enrich tail items/users with knowledge from the head or to improve tail learning through additional contextual information, existing methods still face the following issues: 1) They struggle when interactions of tail users/items are scarce, leading to incomplete preference learning for the tail. 2) They often degrade overall or head-part performance while improving accuracy for tail users/items, thereby harming the user experience. We propose Tail-Aware Data Augmentation (TADA) for long-tail sequential recommendation, which increases the interaction frequency of tail items/users while maintaining head performance, thereby strengthening the model's ability to learn the tail. Specifically, we first capture the co-occurrence and correlation among low-popularity items with a linear model. Building on this, we design two tail-aware augmentation operators, T-Substitute and T-Insert: the former replaces a head item with a relevant item, while the latter uses co-occurrence relationships to extend the original sequence with both head and tail items. The augmented and original sequences are mixed at the representation level to preserve preference knowledge. We further extend the mix operation across different tail-user sequences and augmented sequences to generate richer augmented samples, further improving tail performance. Comprehensive experiments demonstrate the superiority of our method. The code is available at https://github.com/KingGugu/TADA.
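The abstract describes the two operators and the representation-level mix concretely enough to sketch. Below is a minimal, hypothetical illustration of how T-Substitute, T-Insert, and the mix step might look; the function names, the `relevance`/`cooccur` lookups (standing in for the linear co-occurrence model), the assumption that substitutes are correlated tail items, and all hyperparameters are illustrative guesses, not the authors' released implementation (see the linked repository for that).

```python
import random
import numpy as np

def t_substitute(seq, head_items, relevance, rate=0.2):
    """T-Substitute (sketch): replace head items in a sequence with a
    correlated item, so tail items show up in more training sequences."""
    out = list(seq)
    for i, item in enumerate(out):
        if item in head_items and random.random() < rate:
            # `relevance[item]` is assumed to rank correlated tail items,
            # e.g. by scores from a linear item-item co-occurrence model.
            out[i] = relevance[item][0]
    return out

def t_insert(seq, cooccur, rate=0.2):
    """T-Insert (sketch): extend the sequence by inserting items that
    co-occur with existing ones (both head and tail items)."""
    out = []
    for item in seq:
        out.append(item)
        if random.random() < rate and cooccur.get(item):
            out.append(random.choice(cooccur[item]))
    return out

def mix_representations(h_orig, h_aug, alpha=0.5):
    """Mix original and augmented sequence representations so the
    augmented view keeps the user's original preference signal."""
    lam = np.random.beta(alpha, alpha)  # mixup-style coefficient
    return lam * h_orig + (1.0 - lam) * h_aug
```

In this reading, augmented sequences would be encoded by the same sequence model as the originals, with `mix_representations` blending the two hidden states before the training loss; the abstract's cross-sequence mix over tail-user sequences would reuse the same blending step across pairs of users.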
Related papers
- Multi-granularity Interest Retrieval and Refinement Network for Long-Term User Behavior Modeling in CTR Prediction [68.90783662117936]
Click-through Rate (CTR) prediction is crucial for online personalization platforms. Recent advancements have shown that modeling rich user behaviors can significantly improve the performance of CTR prediction. We propose the Multi-granularity Interest Retrieval and Refinement Network (MIRRN).
arXiv Detail & Related papers (2024-11-22T15:29:05Z)
- Head-Tail Cooperative Learning Network for Unbiased Scene Graph Generation [30.467562472064177]
Current unbiased Scene Graph Generation (SGG) methods ignore the substantial sacrifice in the prediction of head predicates.
We propose a model-agnostic Head-Tail Cooperative Learning network that includes head-prefer and tail-prefer feature representation branches.
Our method achieves higher mean Recall with a minimal sacrifice in Recall, reaching new state-of-the-art overall performance.
arXiv Detail & Related papers (2023-08-23T10:29:25Z)
- Feature Fusion from Head to Tail for Long-Tailed Visual Recognition [39.86973663532936]
The biased decision boundary caused by inadequate semantic information in tail classes is one of the key factors contributing to their low recognition accuracy.
We propose to augment tail classes by grafting the diverse semantic information from head classes, referred to as head-to-tail fusion (H2T).
Both theoretical analysis and practical experimentation demonstrate that H2T can contribute to a more optimized solution for the decision boundary.
arXiv Detail & Related papers (2023-06-12T08:50:46Z)
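A minimal sketch of the head-to-tail fusion idea summarized above, assuming a dimension-level splice of head-class features into tail-class features; `frac` and the splice granularity are illustrative assumptions, not the paper's exact operator.

```python
import torch

def head_to_tail_fuse(tail_feat, head_feat, frac=0.3):
    """Sketch of head-to-tail grafting: splice a random subset of
    feature dimensions from a head-class sample into a tail-class
    sample, enriching tail semantics while keeping its identity."""
    d = tail_feat.shape[-1]
    idx = torch.randperm(d)[: int(frac * d)]  # dimensions to graft
    fused = tail_feat.clone()
    fused[..., idx] = head_feat[..., idx]
    return fused
```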
- Personalizing Intervened Network for Long-tailed Sequential User Behavior Modeling [66.02953670238647]
Tail users receive significantly lower-quality recommendations than head users after joint training.
A model trained separately on tail users still achieves inferior results due to limited data.
We propose a novel approach that significantly improves the recommendation performance of tail users.
arXiv Detail & Related papers (2022-08-19T02:50:19Z)
- Dual-branch Hybrid Learning Network for Unbiased Scene Graph Generation [87.13847750383778]
We propose a Dual-branch Hybrid Learning network (DHL) that serves both head and tail predicates for Scene Graph Generation (SGG).
We show that our approach achieves a new state-of-the-art performance on VG and GQA datasets.
arXiv Detail & Related papers (2022-07-16T11:53:50Z)
- Long-tailed Recognition by Learning from Latent Categories [70.6272114218549]
We introduce a Latent Categories based long-tail Recognition (LCReg) method.
Specifically, we learn a set of class-agnostic latent features shared among the head and tail classes.
Then, we implicitly enrich training sample diversity by applying semantic data augmentation to the latent features.
arXiv Detail & Related papers (2022-06-02T12:19:51Z)
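A minimal sketch of semantic data augmentation applied to latent features, as the LCReg summary above describes; the per-class variance table `class_cov` and the diagonal-covariance simplification are assumptions (ISDA-style methods use full class-conditional covariances).

```python
import torch

def semantic_augment(latent, class_cov, labels, strength=0.5):
    """Sketch of semantic augmentation on latent features: perturb each
    feature along directions scaled by its class's feature variance,
    implicitly diversifying the training samples."""
    # class_cov: [num_classes, dim] per-class feature variances (assumed).
    noise = torch.randn_like(latent)
    std = class_cov[labels].sqrt()  # [batch, dim] per-sample scales
    return latent + strength * noise * std
```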
- Learning Transferrable Parameters for Long-tailed Sequential User Behavior Modeling [70.64257515361972]
We argue that focusing on tail users could bring more benefits and address the long-tail issue.
Specifically, we propose a gradient alignment method and adopt an adversarial training scheme to facilitate knowledge transfer from the head to the tail.
arXiv Detail & Related papers (2020-10-22T03:12:02Z)
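A minimal sketch of one common gradient-alignment scheme (PCGrad-style conflict projection) consistent with the summary above; the paper's exact formulation and its adversarial training component are not reproduced here.

```python
import torch

def aligned_gradient(g_head, g_tail):
    """Sketch of gradient alignment over flattened 1-D parameter
    gradients: when head- and tail-batch gradients conflict (negative
    dot product), project the head gradient onto the plane orthogonal
    to the tail gradient so the shared update does not overwrite
    tail knowledge, then sum the two."""
    dot = torch.dot(g_head, g_tail)
    if dot < 0:
        g_head = g_head - dot / g_tail.norm().pow(2) * g_tail
    return g_head + g_tail
```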
- Long-tailed Recognition by Routing Diverse Distribution-Aware Experts [64.71102030006422]
We propose a new long-tailed classifier called RoutIng Diverse Experts (RIDE).
It reduces model variance with multiple experts, reduces model bias with a distribution-aware diversity loss, and reduces computational cost with a dynamic expert routing module.
RIDE outperforms the state of the art by 5% to 7% on the CIFAR100-LT, ImageNet-LT, and iNaturalist 2018 benchmarks.
arXiv Detail & Related papers (2020-10-05T06:53:44Z)
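A minimal sketch of the multi-expert design the RIDE summary above describes; the averaged-logit ensemble and the KL-based diversity term are schematic stand-ins, and the dynamic routing module is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiExpertHead(nn.Module):
    """Sketch of a multi-expert classifier: several expert heads on a
    shared backbone, with logits averaged at inference."""
    def __init__(self, dim, num_classes, num_experts=3):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(dim, num_classes) for _ in range(num_experts)
        )

    def forward(self, feats):
        logits = [e(feats) for e in self.experts]  # per-expert logits
        return torch.stack(logits).mean(0), logits

def diversity_loss(logits_list):
    """Encourage experts to disagree by penalizing low KL divergence
    between each expert and the ensemble mean (schematic stand-in for
    the paper's distribution-aware diversity loss)."""
    mean_p = torch.stack([F.softmax(l, -1) for l in logits_list]).mean(0)
    kls = [F.kl_div(F.log_softmax(l, -1), mean_p, reduction="batchmean")
           for l in logits_list]
    return -torch.stack(kls).mean()  # maximize divergence from the mean
```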
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.