SANCL: Multimodal Review Helpfulness Prediction with Selective Attention
and Natural Contrastive Learning
- URL: http://arxiv.org/abs/2209.05040v3
- Date: Wed, 14 Sep 2022 15:42:56 GMT
- Title: SANCL: Multimodal Review Helpfulness Prediction with Selective Attention
and Natural Contrastive Learning
- Authors: Wei Han, Hui Chen, Zhen Hai, Soujanya Poria, Lidong Bing
- Abstract summary: Multimodal Review Helpfulness Prediction (MRHP) aims to sort product reviews according to the predicted helpfulness scores.
Previous work on this task focuses on attention-based modality fusion, information integration, and relation modeling.
We propose SANCL: Selective Attention and Natural Contrastive Learning for MRHP.
- Score: 41.92038829041499
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With the boom of e-commerce, Multimodal Review Helpfulness Prediction (MRHP),
which aims to sort product reviews according to the predicted helpfulness
scores has become a research hotspot. Previous work on this task focuses on
attention-based modality fusion, information integration, and relation
modeling, which primarily exposes the following drawbacks: 1) the model may
fail to capture the really essential information due to its indiscriminate
attention formulation; 2) lack appropriate modeling methods that take full
advantage of correlation among provided data. In this paper, we propose SANCL:
Selective Attention and Natural Contrastive Learning for MRHP. SANCL adopts a
probe-based strategy to enforce high attention weights on the regions of
greater significance. It also constructs a contrastive learning framework based
on natural matching properties in the dataset. Experimental results on two
benchmark datasets with three categories show that SANCL achieves
state-of-the-art baseline performance with lower memory consumption.
Related papers
- On Machine Learning Approaches for Protein-Ligand Binding Affinity Prediction [2.874893537471256]
This study evaluates the performance of classical tree-based models and advanced neural networks in protein-ligand binding affinity prediction.
We show that combining 2D and 3D model strengths improves active learning outcomes beyond current state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-15T13:06:00Z) - ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - Separating common from salient patterns with Contrastive Representation
Learning [2.250968907999846]
Contrastive Analysis aims at separating common factors of variation between two datasets.
Current models based on Variational Auto-Encoders have shown poor performance in learning semantically-expressive representations.
We propose to leverage the ability of Contrastive Learning to learn semantically expressive representations well adapted for Contrastive Analysis.
arXiv Detail & Related papers (2024-02-19T08:17:13Z) - Mitigating Spurious Correlations in Multi-modal Models during
Fine-tuning [18.45898471459533]
Spurious correlations that degrade model generalization or lead the model to be right for the wrong reasons are one of the main robustness concerns for real-world deployments.
This paper proposes a novel approach to address spurious correlations during fine-tuning for a given domain of interest.
arXiv Detail & Related papers (2023-04-08T05:20:33Z) - SAIS: Supervising and Augmenting Intermediate Steps for Document-Level
Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z) - How Knowledge Graph and Attention Help? A Quantitative Analysis into
Bag-level Relation Extraction [66.09605613944201]
We quantitatively evaluate the effect of attention and Knowledge Graph on bag-level relation extraction (RE)
We find that (1) higher attention accuracy may lead to worse performance as it may harm the model's ability to extract entity mention features; (2) the performance of attention is largely influenced by various noise distribution patterns; and (3) KG-enhanced attention indeed improves RE performance, while not through enhanced attention but by incorporating entity prior.
arXiv Detail & Related papers (2021-07-26T09:38:28Z) - More Than Just Attention: Learning Cross-Modal Attentions with
Contrastive Constraints [63.08768589044052]
We propose Contrastive Content Re-sourcing ( CCR) and Contrastive Content Swapping ( CCS) constraints to address such limitation.
CCR and CCS constraints supervise the training of attention models in a contrastive learning manner without requiring explicit attention annotations.
Experiments on both Flickr30k and MS-COCO datasets demonstrate that integrating these attention constraints into two state-of-the-art attention-based models improves the model performance.
arXiv Detail & Related papers (2021-05-20T08:48:10Z) - Few-Shot Relation Learning with Attention for EEG-based Motor Imagery
Classification [11.873435088539459]
Brain-Computer Interfaces (BCI) based on Electroencephalography (EEG) signals have received a lot of attention.
Motor imagery (MI) data can be used to aid rehabilitation as well as in autonomous driving scenarios.
classification of MI signals is vital for EEG-based BCI systems.
arXiv Detail & Related papers (2020-03-03T02:34:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.