Related papers: Balancing Embedding Spectrum for Recommendation

URL: http://arxiv.org/abs/2406.12032v1
Date: Mon, 17 Jun 2024 18:59:43 GMT
Title: Balancing Embedding Spectrum for Recommendation
Authors: Shaowen Peng, Kazunari Sugiyama, Xin Liu, Tsunenori Mine
Abstract summary: We show that representations tend to span a subspace of the whole embedding space, leading to a suboptimal solution and reducing the model capacity. We propose a novel method called DirectSpec to balance the spectrum distribution of the embeddings during training. We also propose an enhanced variant, DirectSpec+, which employs self-paced gradients to optimize irrelevant samples more effectively.
Score: 7.523823738965443
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern recommender systems heavily rely on high-quality representations learned from high-dimensional sparse data. While significant efforts have been invested in designing powerful algorithms for extracting user preferences, the factors contributing to good representations have remained relatively unexplored. In this work, we shed light on an issue in the existing pair-wise learning paradigm (i.e., the embedding collapse problem), that the representations tend to span a subspace of the whole embedding space, leading to a suboptimal solution and reducing the model capacity. Specifically, optimization on observed interactions is equivalent to a low pass filter causing users/items to have the same representations and resulting in a complete collapse. While negative sampling acts as a high pass filter to alleviate the collapse by balancing the embedding spectrum, its effectiveness is only limited to certain losses, which still leads to an incomplete collapse. To tackle this issue, we propose a novel method called DirectSpec, acting as a reliable all pass filter to balance the spectrum distribution of the embeddings during training, ensuring that users/items effectively span the entire embedding space. Additionally, we provide a thorough analysis of DirectSpec from a decorrelation perspective and propose an enhanced variant, DirectSpec+, which employs self-paced gradients to optimize irrelevant samples more effectively. Moreover, we establish a close connection between DirectSpec+ and uniformity, demonstrating that contrastive learning (CL) can alleviate the collapse issue by indirectly balancing the spectrum. Finally, we implement DirectSpec and DirectSpec+ on two popular recommender models: MF and LightGCN. Our experimental results demonstrate its effectiveness and efficiency over competitive baselines.
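The collapse described in the abstract, where embeddings occupy only a subspace of the embedding space, can be checked by inspecting the singular value spectrum of an embedding matrix. The sketch below is a diagnostic only and is not the DirectSpec method from the paper. The function names `singular_spectrum` and `effective_rank`, the entropy-based effective rank measure, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def singular_spectrum(emb):
    """Singular values of the mean-centered embedding matrix, normalized to sum to 1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    centered = emb - emb.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    total = s.sum()
    return s / total if total > 0 else s


def effective_rank(emb, eps=1e-12):
    """Entropy-based effective rank: exp of the Shannon entropy of the normalized spectrum.

    A value close to the embedding dimension suggests a balanced spectrum;
    a much smaller value suggests the embeddings span only a subspace.
    """
    p = singular_spectrum(emb)
    p = p[p > eps]
    if p.size == 0:
        return 0.0
    return float(np.exp(-(p * np.log(p)).sum()))


# Synthetic data (an assumption, not from the paper's experiments).
rng = np.random.default_rng(0)
n, d = 1000, 32

# Embeddings that use all 32 dimensions.
spread = rng.normal(size=(n, d))

# Embeddings confined to a 4-dimensional subspace, plus small noise.
basis = rng.normal(size=(4, d))
collapsed = rng.normal(size=(n, 4)) @ basis + 1e-3 * rng.normal(size=(n, d))

print("effective rank (spread):   ", effective_rank(spread))
print("effective rank (collapsed):", effective_rank(collapsed))
```

In this sketch the collapsed matrix should report an effective rank near 4, while the spread matrix should report a value near the full dimension of 32.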

Related papers

LoLA-SpecViT: Local Attention SwiGLU Vision Transformer with LoRA for Hyperspectral Imaging [6.360399841791849]
We propose LoLA-SpecViT (Low-rank adaptation Local Attention Spectral Vision Transformer), a lightweight spectral vision transformer. Our model combines a 3D convolutional spectral front-end with local window-based self-attention, enhancing both spectral feature extraction and spatial consistency. Our framework provides a scalable and generalizable solution for real-world HSI applications in agriculture, environmental monitoring, and remote sensing analytics.
arXiv Detail & Related papers (2025-06-21T16:46:00Z)
Zero-Shot Hyperspectral Pansharpening Using Hysteresis-Based Tuning for Spectral Quality Control [5.231219025536678]
Methods for hyperspectral pansharpening often overlook the unique challenges posed by hyperspectral data fusion. A single lightweight neural network is used, with weights that adapt on the fly to each band. The proposed method is fully unsupervised, with no prior training on external data, flexible, and low-complexity.
arXiv Detail & Related papers (2025-05-22T13:24:24Z)
Unveiling Contrastive Learning's Capability of Neighborhood Aggregation for Collaborative Filtering [16.02820746003461]
Graph contrastive learning (GCL) has gradually become a dominant approach in recommender systems. In this paper, we reveal via theoretical derivation that the gradient descent process of the CL objective is formally equivalent to graph convolution. We propose a novel neighborhood aggregation objective to bring users closer to all interacted items while pushing them away from other positive pairs.
arXiv Detail & Related papers (2025-04-14T11:22:41Z)
Supervised Optimism Correction: Be Confident When LLMs Are Sure [91.7459076316849]
We establish a novel theoretical connection between supervised fine-tuning and offline reinforcement learning. We show that the widely used beam search method suffers from unacceptable over-optimism. We propose Supervised Optimism Correction, which introduces a simple yet effective auxiliary loss for token-level $Q$-value estimations.
arXiv Detail & Related papers (2025-04-10T07:50:03Z)
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs [56.24431208419858]
We introduce reward-conditioned Large Language Models (LLMs) that learn from the entire spectrum of response quality within the dataset. We propose an effective yet simple data relabeling method that conditions the preference pairs on quality scores to construct a reward-augmented dataset.
arXiv Detail & Related papers (2024-10-10T16:01:51Z)
Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness [27.43137305486112]
We propose a novel Self-supervised Preference Optimization (SPO) framework, which constructs a self-supervised preference degree loss combined with the alignment loss. The results demonstrate that SPO can be seamlessly integrated with existing preference optimization methods to achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-09-26T12:37:26Z)
Prototypical Contrastive Learning through Alignment and Uniformity for Recommendation [6.790779112538357]
We present Prototypical contrastive learning through Alignment and Uniformity for recommendation. Specifically, we first propose prototypes as a latent space to ensure consistency across different augmentations from the origin graph. The absence of explicit negatives means that directly optimizing the consistency loss between instance and prototype could easily result in dimensional collapse issues.
arXiv Detail & Related papers (2024-02-03T08:19:26Z)
Modulate Your Spectrum in Self-Supervised Learning [65.963806450552]
Whitening loss offers a theoretical guarantee against feature collapse in self-supervised learning. We introduce Spectral Transformation (ST), a framework to modulate the spectrum of embedding. We propose a novel ST instance named IterNorm with trace loss (INTL).
arXiv Detail & Related papers (2023-05-26T09:59:48Z)
Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompt and feature adaptation. Our FRPT with fewer learnable parameters achieves the state-of-the-art performance on three widely-used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z)
Where is the Grass Greener? Revisiting Generalized Policy Iteration for Offline Reinforcement Learning [81.15016852963676]
We re-implement state-of-the-art baselines in the offline RL regime under a fair, unified, and highly factorized framework. We show that when a given baseline outperforms its competing counterparts on one end of the spectrum, it never does on the other end.
arXiv Detail & Related papers (2021-07-03T11:00:56Z)
LSDAT: Low-Rank and Sparse Decomposition for Decision-based Adversarial Attack [74.5144793386864]
LSDAT crafts perturbations in the low-dimensional subspace formed by the sparse component of the input sample and that of an adversarial sample. LSD works directly in the image pixel domain to guarantee that non-$\ell_2$ constraints, such as sparsity, are satisfied.
arXiv Detail & Related papers (2021-03-19T13:10:47Z)
Adversarial Filters of Dataset Biases [96.090959788952]
Large neural models have demonstrated human-level performance on language and vision benchmarks. Their performance degrades considerably on adversarial or out-of-distribution samples. We propose AFLite, which adversarially filters such dataset biases.
arXiv Detail & Related papers (2020-02-10T21:59:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.