Visual Commonsense based Heterogeneous Graph Contrastive Learning
- URL: http://arxiv.org/abs/2311.06553v1
- Date: Sat, 11 Nov 2023 12:01:18 GMT
- Title: Visual Commonsense based Heterogeneous Graph Contrastive Learning
- Authors: Zongzhao Li, Xiangyu Zhu, Xi Zhang, Zhaoxiang Zhang, Zhen Lei
- Abstract summary: We propose a heterogeneous graph contrastive learning method to better handle the visual reasoning task.
Our method is designed to be plug-and-play, so that it can be quickly and easily combined with a wide range of representative methods.
- Score: 79.22206720896664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How to select relevant key objects and how to reason about the complex
relationships across the vision and language domains are two key issues in many
multi-modality applications such as visual question answering (VQA). In this work, we
incorporate the visual commonsense information and propose a heterogeneous
graph contrastive learning method to better handle the visual reasoning task.
Our method is designed to be plug-and-play, so that it can be quickly and
easily combined with a wide range of representative methods. Specifically, our
model contains two key components: the Commonsense-based Contrastive Learning
and the Graph Relation Network. Using contrastive learning, we guide the model
to concentrate more on discriminative objects and relevant visual commonsense
attributes. In addition, thanks to the Graph Relation Network,
the model reasons about the correlations between homogeneous edges and the
similarities between heterogeneous edges, which makes information transmission
more effective. Extensive experiments on four benchmarks show that our method
greatly improves seven representative VQA models, demonstrating its
effectiveness and generalizability.
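To illustrate the general idea of guiding a model toward discriminative objects with contrastive learning, here is a minimal sketch of a generic InfoNCE-style loss. It is not the paper's actual objective, which the abstract does not specify. The function names (`cosine`, `info_nce`), the temperature value, and the toy embeddings are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor (assumed form, not the paper's loss)."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denom - logits[0]

# Hypothetical toy embeddings: a question, one relevant object, two distractors.
question = [1.0, 0.0]
relevant = [0.9, 0.1]
distractors = [[0.0, 1.0], [-1.0, 0.0]]

# Low loss when the relevant object is treated as the positive.
loss_good = info_nce(question, relevant, distractors)
# High loss when a distractor is wrongly treated as the positive.
loss_bad = info_nce(question, distractors[0], [relevant, distractors[1]])
```

In this toy setting the loss is small when the embedding of the relevant object is closest to the question embedding and large otherwise, which is the pressure a contrastive term applies during training.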
Related papers
- Separating common from salient patterns with Contrastive Representation
Learning [2.250968907999846]
Contrastive Analysis aims at separating common factors of variation between two datasets.
Current models based on Variational Auto-Encoders have shown poor performance in learning semantically-expressive representations.
We propose to leverage the ability of Contrastive Learning to learn semantically expressive representations well adapted for Contrastive Analysis.
arXiv Detail & Related papers (2024-02-19T08:17:13Z)
- Entropy Neural Estimation for Graph Contrastive Learning [9.032721248598088]
Contrastive learning on graphs aims at extracting distinguishable high-level representations of nodes.
We propose a simple yet effective subset sampling strategy to contrast pairwise representations between views of a dataset.
We conduct extensive experiments on seven graph benchmarks, and the proposed approach achieves competitive performance.
arXiv Detail & Related papers (2023-07-26T03:55:08Z)
- Cross-view Graph Contrastive Representation Learning on Partially
Aligned Multi-view Data [52.491074276133325]
Multi-view representation learning has developed rapidly over the past decades and has been applied in many fields.
We propose a new cross-view graph contrastive learning framework, which integrates multi-view information to align data and learn latent representations.
Experiments conducted on several real datasets demonstrate the effectiveness of the proposed method on the clustering and classification tasks.
arXiv Detail & Related papers (2022-11-08T09:19:32Z)
- Visual Perturbation-aware Collaborative Learning for Overcoming the
Language Prior Problem [60.0878532426877]
We propose a novel collaborative learning scheme from the viewpoint of visual perturbation calibration.
Specifically, we devise a visual controller to construct two sorts of curated images with different perturbation extents.
Experimental results on two diagnostic VQA-CP benchmark datasets clearly demonstrate its effectiveness.
arXiv Detail & Related papers (2022-07-24T23:50:52Z)
- ACTIVE: Augmentation-Free Graph Contrastive Learning for Partial
Multi-View Clustering [52.491074276133325]
We propose an augmentation-free graph contrastive learning framework to solve the problem of partial multi-view clustering.
The proposed approach elevates instance-level contrastive learning and missing data inference to the cluster-level, effectively mitigating the impact of individual missing data on clustering.
arXiv Detail & Related papers (2022-03-01T02:32:25Z)
- Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint graph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching.
The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k).
It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z)
- Deep Contrastive Learning for Multi-View Network Embedding [20.035449838566503]
Multi-view network embedding aims at projecting nodes in the network to low-dimensional vectors.
Most contrastive learning-based methods rely on high-quality graph embedding.
We design a novel node-to-node Contrastive learning framework for Multi-view network Embedding (CREME).
arXiv Detail & Related papers (2021-08-16T06:29:18Z)
- Group Contrastive Self-Supervised Learning on Graphs [101.45974132613293]
We study self-supervised learning on graphs using contrastive methods.
We argue that contrasting graphs in multiple subspaces enables graph encoders to capture more abundant characteristics.
arXiv Detail & Related papers (2021-07-20T22:09:21Z)
- Mutual Graph Learning for Camouflaged Object Detection [31.422775969808434]
A major challenge is that intrinsic similarities between foreground objects and background surroundings make the features extracted by deep models indistinguishable.
We design a novel Mutual Graph Learning model, which generalizes the idea of conventional mutual learning from regular grids to the graph domain.
In contrast to most mutual learning approaches that use a shared function to model all between-task interactions, MGL is equipped with typed functions for handling different complementary relations.
arXiv Detail & Related papers (2021-04-03T10:14:39Z)