Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment
- URL: http://arxiv.org/abs/2403.01203v1
- Date: Sat, 2 Mar 2024 12:44:59 GMT
- Title: Pseudo-Label Calibration Semi-supervised Multi-Modal Entity Alignment
- Authors: Luyao Wang and Pengnian Qi and Xigang Bao and Chunlai Zhou and Biao Qin
- Abstract summary: Multi-modal entity alignment (MMEA) aims to identify equivalent entities between two multi-modal knowledge graphs for integration.
We introduce Pseudo-label Calibration Multi-modal Entity Alignment (PCMEA), a semi-supervised approach.
We combine pseudo-label calibration with momentum-based contrastive learning to make full use of the labeled and unlabeled data, which improves pseudo-label quality and pulls aligned entities closer.
- Score: 7.147651976133246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal entity alignment (MMEA) aims to identify equivalent entities
between two multi-modal knowledge graphs for integration. Unfortunately, prior work has
focused on improving the interaction and fusion of multi-modal information while
overlooking both the influence of modal-specific noise and the joint use of labeled and
unlabeled data in semi-supervised settings. In this work, we introduce Pseudo-label
Calibration Multi-modal Entity Alignment (PCMEA), a semi-supervised approach.
Specifically, in order to generate holistic
entity representations, we first devise various embedding modules and attention
mechanisms to extract visual, structural, relational, and attribute features.
Different from prior direct fusion methods, we next propose to exploit
mutual information maximization to filter modal-specific noise and to
augment modal-invariant commonality. Then, we combine pseudo-label calibration
with momentum-based contrastive learning to make full use of the labeled and
unlabeled data, which improves the quality of pseudo-labels and pulls aligned
entities closer. Finally, extensive experiments on two MMEA datasets
demonstrate the effectiveness of our PCMEA, which yields state-of-the-art
performance.
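
The mutual-information-maximization step is not spelled out in the abstract; one common way to realize it is an InfoNCE-style lower bound between pairs of modality embeddings of the same entity. The PyTorch sketch below is only illustrative: the projection layers, dimensions, and temperature are assumptions for the example, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalMILoss(nn.Module):
    """Illustrative InfoNCE-style lower bound on the mutual information between
    two modality embeddings of the same entity. Maximizing it encourages
    modal-invariant commonality, while modal-specific noise contributes little
    to the bound and is suppressed. Sizes and temperature are assumptions."""

    def __init__(self, dim_a, dim_b, proj_dim=128, temperature=0.1):
        super().__init__()
        self.proj_a = nn.Linear(dim_a, proj_dim)   # e.g. visual features
        self.proj_b = nn.Linear(dim_b, proj_dim)   # e.g. structural features
        self.t = temperature

    def forward(self, feat_a, feat_b):
        # feat_a, feat_b: (N, dim_a) and (N, dim_b); row i holds the same entity.
        za = F.normalize(self.proj_a(feat_a), dim=-1)
        zb = F.normalize(self.proj_b(feat_b), dim=-1)
        logits = za @ zb.t() / self.t              # (N, N) cross-modal similarities
        targets = torch.arange(za.size(0), device=za.device)
        # Symmetric InfoNCE: matching rows are positives, all other rows negatives.
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))
```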
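Likewise, the pseudo-label calibration combined with momentum-based contrastive learning can be sketched compactly. The snippet below is a minimal sketch under assumed design choices (a MoCo-style EMA target projection, mutual-nearest-neighbour plus confidence-threshold filtering of pseudo pairs); the actual PCMEA calibration procedure may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MomentumContrastiveAligner(nn.Module):
    """Hypothetical sketch: momentum-based contrastive alignment with
    pseudo-label calibration. Module names, dimensions, and the confidence
    threshold are illustrative assumptions, not values from the paper."""

    def __init__(self, dim=128, momentum=0.999, temperature=0.07, threshold=0.9):
        super().__init__()
        self.online_proj = nn.Linear(dim, dim)     # updated by gradients
        self.momentum_proj = nn.Linear(dim, dim)   # updated by EMA only
        self.momentum_proj.load_state_dict(self.online_proj.state_dict())
        for p in self.momentum_proj.parameters():
            p.requires_grad = False
        self.m, self.t, self.threshold = momentum, temperature, threshold

    @torch.no_grad()
    def _update_momentum_encoder(self):
        # MoCo-style exponential moving average keeps the contrastive targets stable.
        for q, k in zip(self.online_proj.parameters(), self.momentum_proj.parameters()):
            k.data.mul_(self.m).add_(q.data, alpha=1.0 - self.m)

    def forward(self, src_emb, tgt_emb, aligned_pairs):
        """src_emb/tgt_emb: fused entity embeddings of the two KGs, (N, dim)/(M, dim);
        aligned_pairs: LongTensor (K, 2) of labeled seed alignments."""
        self._update_momentum_encoder()
        q = F.normalize(self.online_proj(src_emb), dim=-1)
        with torch.no_grad():
            k = F.normalize(self.momentum_proj(tgt_emb), dim=-1)
        sim = q @ k.t() / self.t                    # (N, M) similarity logits

        # Supervised InfoNCE on labeled seed pairs: pull aligned entities closer.
        sup_loss = F.cross_entropy(sim[aligned_pairs[:, 0]], aligned_pairs[:, 1])

        # Pseudo-label calibration on unlabeled entities: keep only mutually
        # nearest, high-confidence matches as pseudo pairs (assumed criterion).
        probs = sim.softmax(dim=-1)
        conf, tgt_idx = probs.max(dim=-1)
        back_idx = probs.argmax(dim=0)              # best source for each target
        mutual = back_idx[tgt_idx] == torch.arange(sim.size(0), device=sim.device)
        mask = (conf > self.threshold) & mutual
        mask[aligned_pairs[:, 0]] = False           # exclude labeled source entities
        if mask.any():
            pseudo_loss = F.cross_entropy(sim[mask], tgt_idx[mask])
        else:
            pseudo_loss = sim.new_zeros(())
        return sup_loss + pseudo_loss
```

In a training loop, the returned loss would be added to the fusion and mutual-information objectives and back-propagated only through the online branch, while the momentum branch drifts via the EMA update.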
Related papers
- Mixture of Attention Yields Accurate Results for Tabular Data [21.410818837489973]
We propose MAYA, an encoder-decoder transformer-based framework.
In the encoder, we design a Mixture of Attention (MOA) that constructs multiple parallel attention branches.
We employ collaborative learning with a dynamic consistency weight constraint to produce more robust representations.
arXiv Detail & Related papers (2025-02-18T03:43:42Z)
- MCSFF: Multi-modal Consistency and Specificity Fusion Framework for Entity Alignment [7.109735168520378]
Multi-modal entity alignment (MMEA) is essential for enhancing knowledge graphs and improving question-answering systems.
Existing methods often focus on integrating modalities through their complementarity but overlook the specificity of each modality.
We propose the Multi-modal Consistency and Specificity Fusion Framework (MCSFF), which innovatively integrates both complementary and specific aspects of modalities.
arXiv Detail & Related papers (2024-10-18T16:35:25Z)
- IBMEA: Exploring Variational Information Bottleneck for Multi-modal Entity Alignment [17.570243718626994]
Multi-modal entity alignment (MMEA) aims to identify equivalent entities between multi-modal knowledge graphs (MMKGs)
We devise multi-modal variational encoders to generate modal-specific entity representations as probability distributions.
We also propose four modal-specific information bottleneck regularizers, limiting the misleading clues in refining modal-specific entity representations.
arXiv Detail & Related papers (2024-07-27T17:12:37Z)
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike in semi-supervised learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, because an instance may contain multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation [51.80447197290866]
Multi-modal knowledge graph completion (MMKGC) aims to discover unobserved knowledge from given knowledge graphs.
Existing MMKGC methods usually extract multi-modal features with pre-trained models.
We introduce a novel framework MyGO to tokenize, fuse, and augment the fine-grained multi-modal representations of entities.
arXiv Detail & Related papers (2024-04-15T05:40:41Z)
- NativE: Multi-modal Knowledge Graph Completion in the Wild [51.80447197290866]
We propose a comprehensive framework NativE to achieve MMKGC in the wild.
NativE proposes a relation-guided dual adaptive fusion module that enables adaptive fusion for any modalities.
We construct a new benchmark called WildKGC with five datasets to evaluate our method.
arXiv Detail & Related papers (2024-03-28T03:04:00Z)
- Joint Multimodal Transformer for Emotion Recognition in the Wild [49.735299182004404]
Multimodal emotion recognition (MMER) systems typically outperform unimodal systems.
This paper proposes an MMER method that relies on a joint multimodal transformer (JMT) for fusion with key-based cross-attention.
arXiv Detail & Related papers (2024-03-15T17:23:38Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- MEAformer: Multi-modal Entity Alignment Transformer for Meta Modality Hybrid [40.745848169903105]
Multi-modal entity alignment (MMEA) aims to discover identical entities across different knowledge graphs.
MMEA algorithms rely on KG-level modality fusion strategies for multi-modal entity representation.
This paper introduces MEAformer, a multi-modal entity alignment transformer approach for meta modality hybrid.
arXiv Detail & Related papers (2022-12-29T20:49:58Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)