Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment
- URL: http://arxiv.org/abs/2403.01203v1
- Date: Sat, 2 Mar 2024 12:44:59 GMT
- Title: Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment
- Authors: Luyao Wang, Pengnian Qi, Xigang Bao, Chunlai Zhou, and Biao Qin
- Abstract summary: Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs for integration.
We introduce Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA), a semi-supervised approach.
We combine pseudo-label calibration with momentum-based contrastive learning to make full use of both labeled and unlabeled data, which improves the quality of the pseudo-labels and pulls aligned entities closer.
- Score: 7.147651976133246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal entity alignment (MMEA) aims to identify equivalent entities
between two multi-modal knowledge graphs for integration. Unfortunately, prior
work has focused on improving the interaction and fusion of multi-modal
information while overlooking the influence of modal-specific noise and the
use of labeled and unlabeled data in semi-supervised settings. In this work,
we introduce Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA), a
semi-supervised approach. Specifically, in order to generate holistic
entity representations, we first devise various embedding modules and attention
mechanisms to extract visual, structural, relational, and attribute features.
Unlike prior direct fusion methods, we next propose to exploit mutual
information maximization to filter out modal-specific noise and to
augment modal-invariant commonality. Then, we combine pseudo-label calibration
with momentum-based contrastive learning to make full use of the labeled and
unlabeled data, which improves the quality of the pseudo-labels and pulls aligned
entities closer. Finally, extensive experiments on two MMEA datasets
demonstrate the effectiveness of our PCMEA, which yields state-of-the-art
performance.
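
The feature-extraction step described in the abstract can be illustrated with a small, hypothetical fusion module: modality-specific embeddings (visual, structural, relational, attribute) are projected and combined with learned attention weights into one holistic entity representation. This is a minimal sketch under our own assumptions (hidden size, a simple per-modality softmax gate), not the paper's actual embedding or attention modules.

```python
# Minimal sketch (not the authors' code): attention-weighted fusion of
# modality-specific entity embeddings into a holistic representation.
# The softmax gate and per-modality projections are illustrative assumptions.
from typing import List

import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    def __init__(self, dim: int, num_modalities: int = 4):
        super().__init__()
        # One learnable attention score per modality (visual, structural,
        # relational, attribute), shared across entities.
        self.modality_scores = nn.Parameter(torch.zeros(num_modalities))
        self.proj = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_modalities)])

    def forward(self, modal_embs: List[torch.Tensor]) -> torch.Tensor:
        """modal_embs: list of [batch, dim] tensors, one per modality."""
        weights = torch.softmax(self.modality_scores, dim=0)
        return sum(w * p(e) for w, p, e in zip(weights, self.proj, modal_embs))
```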
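The mutual-information step also admits a compact illustration: a common way to maximize a lower bound on the mutual information between two modality views of the same entity is an InfoNCE-style contrastive objective, which encourages modal-invariant commonality while down-weighting modal-specific noise. The sketch below is an assumed stand-in (shapes, temperature, symmetric loss), not the paper's exact estimator.

```python
# Minimal sketch (assumed InfoNCE-style estimator, not the paper's exact one):
# maximizing this objective maximizes a lower bound on the mutual information
# between two modality views of the same entities.
import torch
import torch.nn.functional as F


def infonce_mi_loss(view_a: torch.Tensor, view_b: torch.Tensor,
                    temperature: float = 0.1) -> torch.Tensor:
    """view_a, view_b: [batch, dim] embeddings of the same entities taken
    from two modalities (e.g., visual vs. relational)."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature                     # pairwise similarities
    targets = torch.arange(a.size(0), device=a.device)   # diagonal = positives
    # Symmetric cross-entropy: other entities in the batch act as negatives.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```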
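Finally, the pseudo-label calibration with momentum-based contrastive learning can be sketched as follows: a slowly updated momentum encoder scores unlabeled cross-KG entity pairs, only mutually nearest, high-confidence pairs are kept as pseudo-labels, and those pairs are then pulled together by the same contrastive loss used for the labeled seeds. The momentum coefficient, the confidence threshold, and the mutual-nearest-neighbour rule are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch, under stated assumptions, of pseudo-label calibration with a
# momentum (EMA) encoder; not the authors' implementation.
import torch
import torch.nn.functional as F


@torch.no_grad()
def momentum_update(online: torch.nn.Module, momentum_enc: torch.nn.Module,
                    m: float = 0.999) -> None:
    """Exponential-moving-average update of the momentum encoder."""
    for p_o, p_m in zip(online.parameters(), momentum_enc.parameters()):
        p_m.data.mul_(m).add_(p_o.data, alpha=1.0 - m)


@torch.no_grad()
def calibrated_pseudo_labels(momentum_enc, feats_kg1, feats_kg2, threshold=0.8):
    """Score unlabeled entities with the (more stable) momentum encoder and
    keep only mutually nearest, high-confidence pairs as pseudo-labels."""
    z1 = F.normalize(momentum_enc(feats_kg1), dim=-1)   # [n1, dim]
    z2 = F.normalize(momentum_enc(feats_kg2), dim=-1)   # [n2, dim]
    sim = z1 @ z2.t()
    fwd = sim.argmax(dim=1)                 # best KG2 match for each KG1 entity
    bwd = sim.argmax(dim=0)                 # best KG1 match for each KG2 entity
    idx1 = torch.arange(sim.size(0), device=sim.device)
    keep = (bwd[fwd] == idx1) & (sim[idx1, fwd] > threshold)
    return idx1[keep], fwd[keep]            # calibrated pseudo-aligned pairs
```

In such a scheme, the kept pairs would be appended to the labeled seed alignments and fed to a contrastive loss like the one sketched above, while `momentum_update` is called after each optimizer step.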
Related papers
- MCSFF: Multi-modal Consistency and Specificity Fusion Framework for Entity Alignment [7.109735168520378]
Multi-modal entity alignment (MMEA) is essential for enhancing knowledge graphs and improving question-answering systems.
Existing methods often focus on integrating modalities through their complementarity but overlook the specificity of each modality.
We propose the Multi-modal Consistency and Specificity Fusion Framework (MCSFF), which innovatively integrates both complementary and specific aspects of modalities.
arXiv Detail & Related papers (2024-10-18T16:35:25Z)
- IBMEA: Exploring Variational Information Bottleneck for Multi-modal Entity Alignment [17.570243718626994]
Multi-modal entity alignment (MMEA) aims to identify equivalent entities between multi-modal knowledge graphs (MMKGs).
We devise multi-modal variational encoders to generate modal-specific entity representations as probability distributions.
We also propose four modal-specific information bottleneck regularizers, limiting the misleading clues in refining modal-specific entity representations.
arXiv Detail & Related papers (2024-07-27T17:12:37Z)
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- NativE: Multi-modal Knowledge Graph Completion in the Wild [51.80447197290866]
We propose NativE, a comprehensive framework to achieve multi-modal knowledge graph completion (MMKGC) in the wild.
NativE proposes a relation-guided dual adaptive fusion module that enables adaptive fusion for any modalities.
We construct a new benchmark called WildKGC with five datasets to evaluate our method.
arXiv Detail & Related papers (2024-03-28T03:04:00Z)
- Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- Correlation-Aware Mutual Learning for Semi-supervised Medical Image Segmentation [5.045813144375637]
Most existing semi-supervised segmentation methods only focus on extracting information from unlabeled data.
We propose a novel Correlation Aware Mutual Learning framework that leverages labeled data to guide the extraction of information from unlabeled data.
Our approach is based on a mutual learning strategy that incorporates two modules: the Cross-sample Mutual Attention Module (CMA) and the Omni-Correlation Consistency Module (OCC).
arXiv Detail & Related papers (2023-07-12T17:20:05Z)
- Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications [90.6849884683226]
We study the challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data.
Using a precise information-theoretic definition of interactions, our key contribution is the derivation of lower and upper bounds.
We show how these theoretical results can be used to estimate multimodal model performance, guide data collection, and select appropriate multimodal models for various tasks.
arXiv Detail & Related papers (2023-06-07T15:44:53Z)
- MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid [40.745848169903105]
Multi-modal entity alignment (MMEA) aims to discover identical entities across different knowledge graphs.
MMEA algorithms rely on KG-level modality fusion strategies for multi-modal entity representation.
This paper introduces MEAformer, a multi-modal entity alignment transformer approach for meta modality hybrid.
arXiv Detail & Related papers (2022-12-29T20:49:58Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
- Self-Supervised Multimodal Domino: in Search of Biomarkers for Alzheimer's Disease [19.86082635340699]
We propose a taxonomy of all reasonable ways to organize self-supervised representation-learning algorithms.
We first evaluate models on toy multimodal MNIST datasets and then apply them to a multimodal neuroimaging dataset with Alzheimer's disease patients.
Results show that the proposed approach outperforms previous self-supervised encoder-decoder methods.
arXiv Detail & Related papers (2020-12-25T20:28:13Z)