Multi-Margin Loss: Proposal and Application in Recommender Systems
- URL: http://arxiv.org/abs/2405.04614v2
- Date: Sun, 23 Jun 2024 22:05:46 GMT
- Title: Multi-Margin Loss: Proposal and Application in Recommender Systems
- Authors: Makbule Gulcin Ozsoy,
- Abstract summary: Collaborative filtering-based deep learning techniques have regained popularity due to their simplicity.
The proposed Multi-Margin Loss (MML) addresses these challenges by introducing multiple margins and varying weights for negative samples.
MML achieves up to a 20% performance improvement compared to a baseline contrastive loss function with fewer negative samples.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Recommender systems guide users through vast amounts of information by suggesting items based on their predicted preferences. Collaborative filtering-based deep learning techniques have regained popularity due to their simplicity, using only user-item interactions. Typically, these systems consist of three main components: an interaction module, a loss function, and a negative sampling strategy. Initially, researchers focused on enhancing performance by developing complex interaction modules with techniques like multi-layer perceptrons, transformers, or graph neural networks. However, there has been a recent shift toward refining loss functions and negative sampling strategies. This shift has increased interest in contrastive learning, which pulls similar pairs closer while pushing dissimilar ones apart. Contrastive learning involves key practices such as heavy data augmentation, large batch sizes, and hard-negative sampling, but these also bring challenges like high memory demands and under-utilization of some negative samples. The proposed Multi-Margin Loss (MML) addresses these challenges by introducing multiple margins and varying weights for negative samples. MML efficiently utilizes not only the hardest negatives but also other non-trivial negatives, offering a simpler yet effective loss function that outperforms more complex methods, especially when resources are limited. Experiments on two well-known datasets showed MML achieved up to a 20\% performance improvement compared to a baseline contrastive loss function with fewer negative samples.
Related papers
- Rethinking Negative Pairs in Code Search [56.23857828689406]
We propose a simple yet effective Soft-InfoNCE loss that inserts weight terms into InfoNCE.
We analyze the effects of Soft-InfoNCE on controlling the distribution of learnt code representations and on deducing a more precise mutual information estimation.
arXiv Detail & Related papers (2023-10-12T06:32:42Z) - Siamese Prototypical Contrastive Learning [24.794022951873156]
Contrastive Self-supervised Learning (CSL) is a practical solution that learns meaningful visual representations from massive data in an unsupervised approach.
In this paper, we tackle this problem by introducing a simple but effective contrastive learning framework.
The key insight is to employ siamese-style metric loss to match intra-prototype features, while increasing the distance between inter-prototype features.
arXiv Detail & Related papers (2022-08-18T13:25:30Z) - Generating Negative Samples for Sequential Recommendation [83.60655196391855]
We propose to Generate Negative Samples (items) for Sequential Recommendation (SR)
A negative item is sampled at each time step based on the current SR model's learned user preferences toward items.
Experiments on four public datasets verify the importance of providing high-quality negative samples for SR.
arXiv Detail & Related papers (2022-08-07T05:44:13Z) - C$^{4}$Net: Contextual Compression and Complementary Combination Network
for Salient Object Detection [0.0]
We show that feature concatenation works better than other combination methods like multiplication or addition.
Also, joint feature learning gives better results, because of the information sharing during their processing.
arXiv Detail & Related papers (2021-10-22T16:14:10Z) - Rethinking Deep Contrastive Learning with Embedding Memory [58.66613563148031]
Pair-wise loss functions have been extensively studied and shown to continuously improve the performance of deep metric learning (DML)
We provide a new methodology for systematically studying weighting strategies of various pair-wise loss functions, and rethink pair weighting with an embedding memory.
arXiv Detail & Related papers (2021-03-25T17:39:34Z) - Contrastive Learning with Hard Negative Samples [80.12117639845678]
We develop a new family of unsupervised sampling methods for selecting hard negative samples.
A limiting case of this sampling results in a representation that tightly clusters each class, and pushes different classes as far apart as possible.
The proposed method improves downstream performance across multiple modalities, requires only few additional lines of code to implement, and introduces no computational overhead.
arXiv Detail & Related papers (2020-10-09T14:18:53Z) - Simplify and Robustify Negative Sampling for Implicit Collaborative
Filtering [42.832851785261894]
In this paper, we first provide a novel understanding of negative instances by empirically observing that only a few instances are potentially important for model learning.
We then tackle the untouched false negative problem by favouring high-variance samples stored in memory.
Empirical results on two synthetic datasets and three real-world datasets demonstrate both robustness and superiorities of our negative sampling method.
arXiv Detail & Related papers (2020-09-07T19:08:26Z) - Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances.
We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD.
MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z) - Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.