Mining Stable Preferences: Adaptive Modality Decorrelation for
Multimedia Recommendation
- URL: http://arxiv.org/abs/2306.14179v1
- Date: Sun, 25 Jun 2023 09:09:11 GMT
- Title: Mining Stable Preferences: Adaptive Modality Decorrelation for
Multimedia Recommendation
- Authors: Jinghao Zhang, Qiang Liu, Shu Wu, Liang Wang
- Abstract summary: We propose a novel MOdality DEcorrelating STable learning framework, MODEST for brevity, to learn users' stable preferences.
Inspired by sample re-weighting techniques, the proposed method aims to estimate a weight for each item, such that the features from different modalities in the weighted distribution are decorrelated.
Our method can serve as a plug-and-play module for existing multimedia recommendation backbones.
- Score: 23.667430143035787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimedia content is predominant in the modern Web era. In real
scenarios, multiple modalities reveal different aspects of item attributes and
usually carry different importance for user purchase decisions. However, it is
difficult for models to identify users' true preferences towards different
modalities because the modalities are strongly statistically correlated.
Even worse, this strong statistical correlation might mislead models
into learning spurious preferences towards inconsequential modalities. As a
result, when the distribution of the data (modal features) shifts, the learned
spurious preferences are not guaranteed to be as effective on the inference set
as on the training set. We propose a novel MOdality DEcorrelating STable learning
framework, MODEST for brevity, to learn users' stable preferences. Inspired by
sample re-weighting techniques, the proposed method aims to estimate a weight
for each item, such that the features from different modalities in the weighted
distribution are decorrelated. We adopt the Hilbert-Schmidt Independence Criterion
(HSIC) as the independence measure; it is a kernel-based method capable
of evaluating the degree of correlation between two multi-dimensional,
non-linearly related variables. Our method can serve as a plug-and-play module for
existing multimedia recommendation backbones. Extensive experiments on four
public datasets and four state-of-the-art multimedia recommendation backbones
show that our proposed method can improve performance by a
large margin.
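The abstract names HSIC and per-item weights but gives no formulas, so the following is only a minimal NumPy sketch of a sample-weighted HSIC estimate between two feature matrices. It assumes Gaussian kernels with a fixed bandwidth and a V-statistic form of the estimator; the function names `gaussian_kernel` and `weighted_hsic`, the parameter `sigma`, and the synthetic "visual" and "textual" features are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def gaussian_kernel(x, sigma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of x, shape (n, d)."""
    sq_norms = np.sum(x * x, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def weighted_hsic(x, y, weights=None, sigma=1.0):
    """V-statistic HSIC estimate under a weighted empirical distribution.

    x, y    : arrays of shape (n, dx) and (n, dy), one row per item.
    weights : non-negative per-item weights; normalised to sum to 1.
              None means uniform weights (the ordinary biased estimator).
    """
    n = x.shape[0]
    if weights is None:
        w = np.full(n, 1.0 / n)
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
    k = gaussian_kernel(x, sigma)
    l = gaussian_kernel(y, sigma)
    joint = w @ (k * l) @ w                # E[k(x, x') l(y, y')]
    marginals = (w @ k @ w) * (w @ l @ w)  # E[k(x, x')] * E[l(y, y')]
    cross = w @ ((k @ w) * (l @ w))        # E_x[ E_x'[k] * E_y'[l] ]
    return joint + marginals - 2.0 * cross


# Synthetic stand-ins for two modalities of the same 300 items (assumed data).
rng = np.random.default_rng(0)
visual = rng.normal(size=(300, 2))
textual_dep = visual ** 2 + 0.1 * rng.normal(size=(300, 2))  # nonlinear dependence
textual_ind = rng.normal(size=(300, 2))                      # independent features

hsic_dep = weighted_hsic(visual, textual_dep)
hsic_ind = weighted_hsic(visual, textual_ind)
print("dependent:", hsic_dep, "independent:", hsic_ind)
```

In a re-weighting scheme of the kind the abstract describes, `weights` would be learnable parameters adjusted to drive this quantity towards zero; how MODEST parameterises and optimises them is not stated here, so that step is left out of the sketch.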
Related papers
- Diversified Batch Selection for Training Acceleration [68.67164304377732]
A prevalent research line, known as online batch selection, explores selecting informative subsets during the training process.
Vanilla reference-model-free methods involve independently scoring and selecting data in a sample-wise manner.
We propose Diversified Batch Selection (DivBS), which is reference-model-free and can efficiently select diverse and representative samples.
arXiv Detail & Related papers (2024-06-07T12:12:20Z)
- U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation [63.31007867379312]
We introduce U3M: An Unbiased Multiscale Modal Fusion Model for Multimodal Semantics.
We employ feature fusion at multiple scales to ensure the effective extraction and integration of both global and local features.
Experimental results demonstrate that our approach achieves superior performance across multiple datasets.
arXiv Detail & Related papers (2024-05-24T08:58:48Z)
- Debiasing Multimodal Models via Causal Information Minimization [65.23982806840182]
We study bias arising from confounders in a causal graph for multimodal data.
Robust predictive features contain diverse information that helps a model generalize to out-of-distribution data.
We use these features as confounder representations and use them via methods motivated by causal theory to remove bias from models.
arXiv Detail & Related papers (2023-11-28T16:46:14Z)
- Data-driven Preference Learning Methods for Sorting Problems with Multiple Temporal Criteria [17.673512636899076]
This study presents novel preference learning approaches to multiple criteria sorting problems in the presence of temporal criteria.
To enhance scalability and accommodate learnable time discount factors, we introduce a novel monotonic Recurrent Neural Network (mRNN).
The proposed mRNN can describe the preference dynamics by depicting marginal value functions and personalized time discount factors along with time.
arXiv Detail & Related papers (2023-09-22T05:08:52Z)
- SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets [30.262094419776208]
Current approaches assume that the source data is available during adaptation and that the source consists of paired multi-modal data.
We propose a switching framework which automatically chooses between two complementary methods of cross-modal pseudo-label fusion.
Our method achieves an improvement in mIoU of up to 12% over competing baselines.
arXiv Detail & Related papers (2023-08-23T02:57:58Z)
- Generalizing Multimodal Variational Methods to Sets [35.69942798534849]
This paper presents a novel variational method on sets called the Set Multimodal VAE (SMVAE) for learning a multimodal latent space.
By modeling the joint-modality posterior distribution directly, the proposed SMVAE learns to exchange information between multiple modalities and compensate for the drawbacks caused by factorization.
arXiv Detail & Related papers (2022-12-19T23:50:19Z)
- Adaptive Contrastive Learning on Multimodal Transformer for Review Helpfulness Predictions [40.70793282367128]
We propose Multimodal Contrastive Learning for the Multimodal Review Helpfulness Prediction (MRHP) problem.
In addition, we introduce an Adaptive Weighting scheme for our contrastive learning approach.
Finally, we propose a Multimodal Interaction module to address the unaligned nature of multimodal data.
arXiv Detail & Related papers (2022-11-07T13:05:56Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.