Related papers: An Interpretable and Uncertainty Aware Multi-Task Framework for Multi-Aspect Sentiment Analysis

URL: http://arxiv.org/abs/2009.09112v2
Date: Mon, 31 May 2021 03:44:49 GMT
Title: An Interpretable and Uncertainty Aware Multi-Task Framework for Multi-Aspect Sentiment Analysis
Authors: Tian Shi and Ping Wang and Chandan K. Reddy
Abstract summary: Document-level Multi-aspect Sentiment Classification (DMSC) is a challenging and imminent problem. We propose a deliberate self-attention-based deep neural network model, namely FEDAR, for the DMSC problem. FEDAR can achieve competitive performance while also being able to interpret the predictions made.
Score: 15.755185152760083
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, several online platforms have seen a rapid increase in the number of review systems that request users to provide aspect-level feedback. Document-level Multi-aspect Sentiment Classification (DMSC), where the goal is to predict the ratings/sentiment from a review at an individual aspect level, has become a challenging and imminent problem. To tackle this challenge, we propose a deliberate self-attention-based deep neural network model, namely FEDAR, for the DMSC problem, which can achieve competitive performance while also being able to interpret the predictions made. FEDAR is equipped with a highway word embedding layer to transfer knowledge from pre-trained word embeddings, an RNN encoder layer with output features enriched by pooling and factorization techniques, and a deliberate self-attention layer. In addition, we also propose an Attention-driven Keywords Ranking (AKR) method, which can automatically discover aspect keywords and aspect-level opinion keywords from the review corpus based on the attention weights. These keywords are significant for rating predictions by FEDAR. Since crowdsourcing annotation can be an alternate way to recover missing ratings of reviews, we propose a LEcture-AuDience (LEAD) strategy to estimate model uncertainty in the context of multi-task learning, so that valuable human resources can focus on the most uncertain predictions. Our extensive set of experiments on five different open-domain DMSC datasets demonstrate the superiority of the proposed FEDAR and LEAD models. We further introduce two new DMSC datasets in the healthcare domain and benchmark different baseline models and our models on them. Attention weights visualization results and visualization of aspect and opinion keywords demonstrate the interpretability of our model and the effectiveness of our AKR method.

Related papers

Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing [90.65399476233495]
We introduce RISEBench, the first benchmark for evaluating Reasoning-Informed viSual Editing (RISE). RISEBench focuses on four key reasoning types: Temporal, Causal, Spatial, and Logical Reasoning. We propose an evaluation framework that assesses Instruction Reasoning, Appearance Consistency, and Visual Plausibility with both human judges and an LMM-as-a-judge approach.
arXiv Detail & Related papers (2025-04-03T17:59:56Z)
A Comprehensive Review on Hashtag Recommendation: From Traditional to Deep Learning and Beyond [0.37865171120254354]
Hashtags, as a fundamental categorization mechanism, play a pivotal role in enhancing content visibility and user engagement. The development of accurate and robust hashtag recommendation systems remains a complex and evolving research challenge. This review article conducts a systematic analysis of hashtag recommendation systems, examining recent advancements across several dimensions.
arXiv Detail & Related papers (2025-03-24T13:40:36Z)
Evaluating and Advancing Multimodal Large Language Models in Ability Lens [30.083110119139793]
We introduce AbilityLens, a unified benchmark designed to evaluate MLLMs across six key perception abilities. We identify the strengths and weaknesses of current models, highlighting stability patterns and revealing a notable performance gap between open-source and closed-source models. We also design a simple ability-specific model merging method that combines the best ability checkpoint from early training stages, effectively mitigating performance decline due to ability conflict.
arXiv Detail & Related papers (2024-11-22T04:41:20Z)
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models [71.36392373876505]
We introduce MMIE, a large-scale benchmark for evaluating interleaved multimodal comprehension and generation in Large Vision-Language Models (LVLMs). MMIE comprises 20K meticulously curated multimodal queries, spanning 3 categories, 12 fields, and 102 subfields, including mathematics, coding, physics, literature, health, and arts. It supports both interleaved inputs and outputs, offering a mix of multiple-choice and open-ended question formats to evaluate diverse competencies.
arXiv Detail & Related papers (2024-10-14T04:15:00Z)
Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance. Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly. Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness. Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings. This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification [62.425462136772666]
Fine-grained ship classification in remote sensing (RS-FGSC) poses a significant challenge due to the high similarity between classes and the limited availability of labeled data. Recent advancements in large pre-trained Vision-Language Models (VLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning. This study delves into harnessing the potential of VLMs to enhance classification accuracy for unseen ship categories.
arXiv Detail & Related papers (2024-03-13T05:48:58Z)
Debiasing Multimodal Large Language Models [61.6896704217147]
Large Vision-Language Models (LVLMs) have become indispensable tools in computer vision and natural language processing. Our investigation reveals a noteworthy bias in the generated content, where the output is primarily influenced by the underlying Large Language Models (LLMs) prior to the input image. To rectify these biases and redirect the model's focus toward vision information, we introduce two simple, training-free strategies.
arXiv Detail & Related papers (2024-03-08T12:35:07Z)
HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation [20.178644251662316]
We introduce the hierarchical graph of thoughts (HGOT) to enhance the retrieval of pertinent passages during in-context learning. The framework employs the divide-and-conquer strategy to break down complex queries into manageable sub-queries. It refines self-consistency majority voting for answer selection, which incorporates the recently proposed citation recall and precision metrics.
arXiv Detail & Related papers (2024-02-14T18:41:19Z)
Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models [53.337728969143086]
Recommendation systems harness user-item interactions like clicks and reviews to learn their representations. Previous studies improve recommendation accuracy and interpretability by modeling user preferences across various aspects and intents. We introduce a chain-based prompting approach to uncover semantic aspect-aware interactions.
arXiv Detail & Related papers (2023-12-26T15:44:09Z)
Don't Be So Sure! Boosting ASR Decoding via Confidence Relaxation [7.056222499095849]
Beam search seeks the transcript with the greatest likelihood computed using the predicted distribution. We show that recently proposed Self-Supervised Learning (SSL)-based ASR models tend to yield exceptionally confident predictions. We propose a decoding procedure that improves the performance of fine-tuned ASR models.
arXiv Detail & Related papers (2022-12-27T06:42:26Z)
