Integrating Large Pre-trained Models into Multimodal Named Entity
Recognition with Evidential Fusion
- URL: http://arxiv.org/abs/2306.16991v1
- Date: Thu, 29 Jun 2023 14:50:23 GMT
- Title: Integrating Large Pre-trained Models into Multimodal Named Entity
Recognition with Evidential Fusion
- Authors: Weide Liu, Xiaoyang Zhong, Jingwen Hou, Shaohua Li, Haozhe Huang and
Yuming Fang
- Abstract summary: We propose incorporating uncertainty estimation into the MNER task, producing trustworthy predictions.
Our proposed algorithm models the distribution of each modality as a Normal-inverse Gamma distribution, and fuses them into a unified distribution.
Experiments on two datasets demonstrate that our proposed method outperforms the baselines and achieves new state-of-the-art performance.
- Score: 31.234455370113075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal Named Entity Recognition (MNER) is a crucial task for information
extraction from social media platforms such as Twitter. Most current methods
rely on attention weights to extract information from both text and images but
are often unreliable and lack interpretability. To address this problem, we
propose incorporating uncertainty estimation into the MNER task, producing
trustworthy predictions. Our proposed algorithm models the distribution of each
modality as a Normal-inverse Gamma distribution, and fuses them into a unified
distribution with an evidential fusion mechanism, enabling hierarchical
characterization of uncertainties and promotion of prediction accuracy and
trustworthiness. Additionally, we explore the potential of pre-trained large
foundation models in MNER and propose an efficient fusion approach that
leverages their robust feature representations. Experiments on two datasets
demonstrate that our proposed method outperforms the baselines and achieves new
state-of-the-art performance.
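As a rough illustration of the evidential fusion described in the abstract, the sketch below combines two per-modality Normal-inverse Gamma (NIG) distributions using the NIG summation rule popularized by the MoNIG paper (listed under related papers) and reads off aleatoric and epistemic uncertainty. The toy parameter values are illustrative, and the paper's exact fusion mechanism may differ.

```python
def nig_fuse(g1, v1, a1, b1, g2, v2, a2, b2):
    """Fuse two Normal-inverse Gamma distributions NIG(gamma, nu, alpha, beta).

    Follows the NIG summation rule from the MoNIG paper (listed below);
    the evidential fusion used in the MNER paper may differ in detail.
    """
    g = (v1 * g1 + v2 * g2) / (v1 + v2)  # evidence-weighted fused mean
    v = v1 + v2                          # pooled pseudo-observations of the mean
    a = a1 + a2 + 0.5                    # pooled evidence about the variance
    b = b1 + b2 + 0.5 * (v1 * (g1 - g) ** 2 + v2 * (g2 - g) ** 2)
    return g, v, a, b


def nig_uncertainty(v, a, b):
    """Hierarchical uncertainties of a NIG distribution (requires alpha > 1)."""
    aleatoric = b / (a - 1.0)            # expected data noise, E[sigma^2]
    epistemic = b / (v * (a - 1.0))      # uncertainty about the mean, Var[mu]
    return aleatoric, epistemic


# Toy example: a confident text modality and a less certain image modality.
text_nig = (0.8, 5.0, 3.0, 1.0)   # gamma, nu, alpha, beta (illustrative values)
image_nig = (0.2, 0.5, 1.5, 2.0)
g, v, a, b = nig_fuse(*text_nig, *image_nig)
print("fused prediction:", g)
print("aleatoric, epistemic:", nig_uncertainty(v, a, b))
```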
Related papers
- Uncertainty Quantification via Hölder Divergence for Multi-View Representation Learning [18.419742575630217]
This paper introduces a novel algorithm based on Hölder Divergence (HD) to enhance the reliability of multi-view learning.
Through Dempster-Shafer theory, uncertainty from different modalities is integrated, thereby generating a comprehensive result.
Mathematically, HD proves to better measure the "distance" between the real data distribution and the predictive distribution of the model.
arXiv Detail & Related papers (2024-10-29T04:29:44Z)
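For the Dempster-Shafer combination step mentioned in the entry above, here is a minimal sketch using the reduced Dempster rule commonly used in evidential multi-view classification. The function name and the toy belief/uncertainty values are illustrative, and the HD-based method may weight the per-modality evidence differently.

```python
import numpy as np


def ds_combine(b1, u1, b2, u2):
    """Combine two subjective opinions with the reduced Dempster rule.

    b1, b2 are per-class belief masses and u1, u2 scalar uncertainty masses,
    with b.sum() + u == 1 for each modality.
    """
    conflict = float(np.outer(b1, b2).sum() - (b1 * b2).sum())  # mass on disagreeing classes
    scale = 1.0 - conflict
    b = (b1 * b2 + b1 * u2 + b2 * u1) / scale
    u = (u1 * u2) / scale
    return b, u


# Two modalities that lean toward class 0 with different confidence.
b_text, u_text = np.array([0.7, 0.1]), 0.2
b_image, u_image = np.array([0.4, 0.2]), 0.4
print(ds_combine(b_text, u_text, b_image, u_image))
```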
- Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax [73.03684002513218]
We enhance Deep InfoMax (DIM) to enable automatic matching of learned representations to a selected prior distribution.
We show that such a modification allows for learning uniformly and normally distributed representations.
The results indicate a moderate trade-off between performance on the downstream tasks and the quality of distribution matching (DM).
arXiv Detail & Related papers (2024-10-09T15:40:04Z)
- Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
We propose a novel multi-modality evidential fusion pipeline for eye disease screening.
It provides a measure of confidence for each modality and elegantly integrates the multi-modality information.
Experimental results on both public and internal datasets demonstrate that our model excels in robustness.
arXiv Detail & Related papers (2024-05-28T13:27:30Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
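A minimal sketch of the member-specific low-rank idea described in the LoRA-Ensemble entry above, assuming a PyTorch linear projection as the shared backbone; class and parameter names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn


class LoRAEnsembleLinear(nn.Module):
    """One shared, frozen projection plus per-member low-rank updates.

    The pre-trained weight is shared by all ensemble members; only the
    rank-r matrices A_m, B_m are member-specific.
    """

    def __init__(self, base: nn.Linear, n_members: int = 4, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # backbone stays frozen and shared
            p.requires_grad_(False)
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(n_members, rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_members, d_out, rank))

    def forward(self, x: torch.Tensor, member: int) -> torch.Tensor:
        delta = self.B[member] @ self.A[member]   # rank-r member-specific update
        return self.base(x) + x @ delta.T


# Disagreement across member outputs gives an implicit ensemble estimate of
# uncertainty without copying the backbone per member.
proj = LoRAEnsembleLinear(nn.Linear(64, 64), n_members=4, rank=8)
x = torch.randn(2, 64)
outs = torch.stack([proj(x, m) for m in range(4)])
```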
- Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z)
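The focal-loss variant mentioned in the entry above is not spelled out in the summary; the sketch below shows the generic multi-class focal loss (Lin et al., 2017) with an optional per-class weight as an illustrative hook for long-tailed label distributions.

```python
import torch
import torch.nn.functional as F


def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Generic multi-class focal loss; `alpha` holds optional per-class weights."""
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # log-prob of true class
    pt = log_pt.exp()
    loss = -((1.0 - pt) ** gamma) * log_pt       # down-weight easy, confident examples
    if alpha is not None:                        # rare classes can receive larger weights
        loss = loss * alpha.gather(0, targets)
    return loss.mean()


# Example: 3 samples, 5 action classes, tail classes up-weighted.
logits = torch.randn(3, 5)
targets = torch.tensor([0, 4, 4])
weights = torch.tensor([0.5, 0.8, 1.0, 1.5, 2.0])
print(focal_loss(logits, targets, alpha=weights))
```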
- Learning Against Distributional Uncertainty: On the Trade-off Between Robustness and Specificity [24.874664446700272]
This paper studies a new framework that unifies the three approaches and that addresses the two challenges mentioned above.
The asymptotic properties (e.g., consistency and asymptotic normality), non-asymptotic properties (e.g., unbiasedness and error bounds), and a Monte-Carlo-based solution method of the proposed model are studied.
arXiv Detail & Related papers (2023-01-31T11:33:18Z)
- Transformer Uncertainty Estimation with Hierarchical Stochastic Attention [8.95459272947319]
We propose a novel way to enable transformers to have the capability of uncertainty estimation.
This is achieved by learning a hierarchical self-attention that attends to values and a set of learnable centroids.
We empirically evaluate our model on two text classification tasks with both in-domain (ID) and out-of-domain (OOD) datasets.
arXiv Detail & Related papers (2021-12-27T16:43:31Z)
- Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.