An Interpretable and Uncertainty Aware Multi-Task Framework for
Multi-Aspect Sentiment Analysis
- URL: http://arxiv.org/abs/2009.09112v2
- Date: Mon, 31 May 2021 03:44:49 GMT
- Title: An Interpretable and Uncertainty Aware Multi-Task Framework for
Multi-Aspect Sentiment Analysis
- Authors: Tian Shi and Ping Wang and Chandan K. Reddy
- Abstract summary: Document-level Multi-aspect Sentiment Classification (DMSC) is a challenging and imminent problem.
We propose a deliberate self-attention-based deep neural network model, namely FEDAR, for the DMSC problem.
FEDAR can achieve competitive performance while also being able to interpret the predictions made.
- Score: 15.755185152760083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, several online platforms have seen a rapid increase in the
number of review systems that request users to provide aspect-level feedback.
Document-level Multi-aspect Sentiment Classification (DMSC), where the goal is
to predict the ratings/sentiment from a review at an individual aspect level,
has become a challenging and imminent problem. To tackle this challenge, we
propose a deliberate self-attention-based deep neural network model, namely
FEDAR, for the DMSC problem, which can achieve competitive performance while
also being able to interpret the predictions made. FEDAR is equipped with a
highway word embedding layer to transfer knowledge from pre-trained word
embeddings, an RNN encoder layer with output features enriched by pooling and
factorization techniques, and a deliberate self-attention layer. In addition,
we also propose an Attention-driven Keywords Ranking (AKR) method, which can
automatically discover aspect keywords and aspect-level opinion keywords from
the review corpus based on the attention weights. These keywords are
significant for rating predictions by FEDAR. Since crowdsourcing annotation can
be an alternate way to recover missing ratings of reviews, we propose a
LEcture-AuDience (LEAD) strategy to estimate model uncertainty in the context
of multi-task learning, so that valuable human resources can focus on the most
uncertain predictions. Our extensive set of experiments on five different
open-domain DMSC datasets demonstrates the superiority of the proposed FEDAR and
LEAD models. We further introduce two new DMSC datasets in the healthcare
domain and benchmark different baseline models and our models on them.
Attention weights visualization results and visualization of aspect and opinion
keywords demonstrate the interpretability of our model and the effectiveness of
our AKR method.
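The abstract says only that AKR discovers keywords from the review corpus based on attention weights; it does not give the ranking formula. As a rough illustration of that general idea, and not the authors' AKR method, the sketch below averages the attention weight each token receives across a toy corpus and ranks tokens by that average. The function name `rank_keywords`, the averaging rule, the `min_count` filter, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def rank_keywords(docs, attention, top_k=3, min_count=1):
    """Rank tokens by their mean attention weight over a corpus.

    docs:      list of token lists, one per review (hypothetical input format)
    attention: list of weight lists aligned token-by-token with docs
    min_count: ignore tokens seen fewer than this many times
    """
    total = defaultdict(float)  # summed attention per token
    count = defaultdict(int)    # number of occurrences per token
    for tokens, weights in zip(docs, attention):
        if len(tokens) != len(weights):
            raise ValueError("tokens and attention weights must align")
        for tok, w in zip(tokens, weights):
            total[tok] += w
            count[tok] += 1
    # Mean attention per token, keeping only sufficiently frequent tokens.
    scores = {t: total[t] / count[t] for t in total if count[t] >= min_count}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Toy corpus with made-up attention weights for a single aspect.
docs = [
    ["the", "staff", "was", "friendly"],
    ["staff", "ignored", "us"],
    ["friendly", "and", "clean"],
]
attention = [
    [0.05, 0.50, 0.05, 0.40],
    [0.60, 0.30, 0.10],
    [0.50, 0.10, 0.40],
]

print(rank_keywords(docs, attention, top_k=3))  # ['staff', 'friendly', 'clean']
```

Averaging rather than summing keeps very frequent function words from dominating the ranking; a real system would likely also need stop-word handling and per-aspect attention distributions, which are outside the scope of this sketch.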
Related papers
- MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models [71.36392373876505]
We introduce MMIE, a large-scale benchmark for evaluating interleaved multimodal comprehension and generation in Large Vision-Language Models (LVLMs)
MMIE comprises 20K meticulously curated multimodal queries, spanning 3 categories, 12 fields, and 102 subfields, including mathematics, coding, physics, literature, health, and arts.
It supports both interleaved inputs and outputs, offering a mix of multiple-choice and open-ended question formats to evaluate diverse competencies.
arXiv Detail & Related papers (2024-10-14T04:15:00Z)
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification [62.425462136772666]
Fine-grained ship classification in remote sensing (RS-FGSC) poses a significant challenge due to the high similarity between classes and the limited availability of labeled data.
Recent advancements in large pre-trained Vision-Language Models (VLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning.
This study delves into harnessing the potential of VLMs to enhance classification accuracy for unseen ship categories.
arXiv Detail & Related papers (2024-03-13T05:48:58Z)
- Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing.
Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the prior of the underlying Large Language Models (LLMs) rather than by the input image.
To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
- HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation [20.178644251662316]
We introduce the hierarchical graph of thoughts (HGOT) to enhance the retrieval of pertinent passages during in-context learning.
The framework employs the divide-and-conquer strategy to break down complex queries into manageable sub-queries.
It refines self-consistency majority voting for answer selection, which incorporates the recently proposed citation recall and precision metrics.
arXiv Detail & Related papers (2024-02-14T18:41:19Z)
- Don't Be So Sure! Boosting ASR Decoding via Confidence Relaxation [7.056222499095849]
Beam search seeks the transcript with the greatest likelihood computed using the predicted distribution.
We show that recently proposed Self-Supervised Learning (SSL)-based ASR models tend to yield exceptionally confident predictions.
We propose a decoding procedure that improves the performance of fine-tuned ASR models.
arXiv Detail & Related papers (2022-12-27T06:42:26Z)
- Review of coreference resolution in English and Persian [8.604145658574689]
Coreference resolution (CR) identifies expressions referring to the same real-world entity.
This paper explores the latest advancements in CR, spanning coreference and anaphora resolution.
Recognizing the unique challenges of Persian CR, we dedicate a focused analysis to this under-resourced language.
arXiv Detail & Related papers (2022-11-08T18:14:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.