MARE: Multi-Aspect Rationale Extractor on Unsupervised Rationale Extraction
- URL: http://arxiv.org/abs/2410.03531v1
- Date: Fri, 4 Oct 2024 15:52:29 GMT
- Title: MARE: Multi-Aspect Rationale Extractor on Unsupervised Rationale Extraction
- Authors: Han Jiang, Junwen Duan, Zhe Qu, Jianxin Wang
- Abstract summary: Unsupervised rationale extraction aims to extract text snippets to support model predictions without explicit rationale annotation.
Previous works often encode each aspect independently, which may limit their ability to capture meaningful internal correlations between aspects.
We propose a Multi-Aspect Rationale Extractor (MARE) to explain and predict multiple aspects simultaneously.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised rationale extraction aims to extract text snippets to support model predictions without explicit rationale annotation. Researchers have made many efforts to solve this task. Previous works often encode each aspect independently, which may limit their ability to capture meaningful internal correlations between aspects. While there has been significant work on mitigating spurious correlations, our approach focuses on leveraging the beneficial internal correlations to improve multi-aspect rationale extraction. In this paper, we propose a Multi-Aspect Rationale Extractor (MARE) to explain and predict multiple aspects simultaneously. Concretely, we propose a Multi-Aspect Multi-Head Attention (MAMHA) mechanism based on hard deletion to encode multiple text chunks simultaneously. Furthermore, multiple special tokens are prepended in front of the text, each corresponding to one certain aspect. Finally, multi-task training is deployed to reduce the training overhead. Experimental results on two unsupervised rationale extraction benchmarks show that MARE achieves state-of-the-art performance. Ablation studies further demonstrate the effectiveness of our method. Our code is available at https://github.com/CSU-NLP-Group/MARE.
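To make the mechanism concrete, here is a minimal PyTorch sketch of a hard-deletion, multi-aspect attention layer in the spirit of MAMHA. The module name, the token-level selector, and the Gumbel-softmax relaxation are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiAspectAttention(nn.Module):
    """Sketch of MAMHA-style encoding: one special token per aspect is
    prepended to the text, and irrelevant tokens are hard-deleted from
    attention. Hypothetical names; not the released MARE code."""
    def __init__(self, d_model=256, n_heads=4, n_aspects=3):
        super().__init__()
        self.aspect_tokens = nn.Parameter(torch.randn(n_aspects, d_model))
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.selector = nn.Linear(d_model, n_aspects * 2)  # keep/delete per aspect
        self.n_aspects = n_aspects

    def forward(self, x):                                  # x: (batch, seq, d)
        b, s, _ = x.shape
        logits = self.selector(x).view(b, s, self.n_aspects, 2)
        # Hard 0/1 keep decisions via straight-through Gumbel-softmax.
        keep = F.gumbel_softmax(logits, hard=True, dim=-1)[..., 0]  # (b, s, a)
        tokens = torch.cat([self.aspect_tokens.expand(b, -1, -1), x], dim=1)
        # Hard deletion: a token attends nowhere if every aspect drops it.
        # (A per-aspect mask per head would be closer to the paper but needs
        # a custom attention kernel; the boolean mask here also cuts the
        # gradient to the selector, which a soft mask would preserve.)
        pad = torch.zeros(b, self.n_aspects, dtype=torch.bool, device=x.device)
        mask = torch.cat([pad, keep.amax(dim=-1) < 0.5], dim=1)     # (b, a+s)
        out, _ = self.attn(tokens, tokens, tokens, key_padding_mask=mask)
        # The first n_aspects states feed per-aspect prediction heads
        # (multi-task training); `keep` doubles as the per-aspect rationale.
        return out[:, :self.n_aspects], keep
```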
Related papers
- Mutual Distillation Learning For Person Re-Identification [27.350415735863184]
We propose a novel approach, Mutual Distillation Learning for Person Re-identification (termed MDPR).
Our approach encompasses two branches: a hard content branch that extracts local features via a uniform horizontal partitioning strategy, and a soft content branch that dynamically distinguishes between foreground and background.
Our method achieves an impressive 88.7%/94.4% in mAP/Rank-1 on the DukeMTMC-reID dataset, surpassing the current state-of-the-art results.
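The hard content branch's uniform horizontal partitioning is a common person re-ID pattern; a minimal sketch, assuming a (batch, C, H, W) backbone feature map and an illustrative stripe count:

```python
import torch

def horizontal_stripes(feat: torch.Tensor, n_stripes: int = 6) -> torch.Tensor:
    """Split a (batch, C, H, W) feature map into n_stripes horizontal bands
    and average-pool each band into one local descriptor."""
    b, c, h, w = feat.shape
    assert h % n_stripes == 0, "height must divide evenly into stripes"
    bands = feat.view(b, c, n_stripes, h // n_stripes, w)
    return bands.mean(dim=(3, 4))              # (batch, C, n_stripes)

# Example: a ResNet-style 24x8 feature map split into 6 bands.
print(horizontal_stripes(torch.randn(2, 2048, 24, 8)).shape)  # [2, 2048, 6]
```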
arXiv Detail & Related papers (2024-01-12T07:49:02Z)
- Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles [95.49699178874683]
We propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals.
arXiv Detail & Related papers (2023-11-23T15:47:33Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
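A minimal sketch of such an implicit query: a learnable, input-independent vector that cross-attends over one modality's tokens to pool global context (module and parameter names are assumptions):

```python
import torch
import torch.nn as nn

class ImplicitQuery(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.query = nn.Parameter(torch.randn(1, 1, d_model))  # learned, not input-derived
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, tokens):                 # tokens: (batch, seq, d_model)
        q = self.query.expand(tokens.size(0), -1, -1)
        # The query attends over all tokens of one modality, adaptively
        # aggregating global contextual cues into a single vector.
        ctx, _ = self.attn(q, tokens, tokens)
        return ctx.squeeze(1)                  # (batch, d_model)

# One such query per modality keeps features modality-specific; a later
# fusion step would handle cross-modal alignment.
```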
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- DIR-AS: Decoupling Individual Identification and Temporal Reasoning for Action Segmentation [84.78383981697377]
Fully supervised action segmentation performs frame-wise action recognition with dense annotations and often suffers from over-segmentation.
We develop a novel local-global attention mechanism with temporal pyramid dilation and temporal pyramid pooling for efficient multi-scale attention.
We achieve state-of-the-art accuracy, e.g., 82.8% (+2.6%) on GTEA and 74.7% (+1.2%) on Breakfast, which demonstrates the effectiveness of our proposed method.
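A minimal sketch of the multi-scale idea: a pyramid of dilated 1-D convolutions plus a pooled global summary. The dilation rates and fusion layer are assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalPyramid(nn.Module):
    def __init__(self, channels=256, dilations=(1, 2, 4, 8)):
        super().__init__()
        # One dilated temporal conv per scale; padding preserves length T.
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations)
        self.fuse = nn.Conv1d(channels * (len(dilations) + 1), channels, 1)

    def forward(self, x):                      # x: (batch, C, T) frame features
        feats = [branch(x) for branch in self.branches]
        # Temporal pyramid pooling, reduced here to one global average
        # broadcast back over time as the coarsest scale.
        feats.append(F.adaptive_avg_pool1d(x, 1).expand_as(x))
        return self.fuse(torch.cat(feats, dim=1))
```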
arXiv Detail & Related papers (2023-04-04T20:27:18Z)
- Program Generation from Diverse Video Demonstrations [49.202289347899836]
Generalising over multiple observations has historically been difficult for machines.
We propose a model that can extract general rules from video demonstrations by simultaneously performing summarisation and translation.
arXiv Detail & Related papers (2023-02-01T01:51:45Z)
- Interlock-Free Multi-Aspect Rationalization for Text Classification [33.33452117387646]
We address the interlocking problem in the multi-aspect setting.
We propose a multi-stage training method incorporating an additional self-supervised contrastive loss.
Empirical results on the beer review dataset show that our method significantly improves rationalization performance.
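A minimal sketch of an InfoNCE-style contrastive loss such a training stage might add; pairing each example's rationale encoding with its full-text encoding as positives is an assumption for illustration:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z_rat: torch.Tensor, z_full: torch.Tensor, tau: float = 0.1):
    """z_rat, z_full: (batch, d) encodings of each example's extracted
    rationale and full text; matching rows are positives, others negatives."""
    z_rat = F.normalize(z_rat, dim=-1)
    z_full = F.normalize(z_full, dim=-1)
    logits = z_rat @ z_full.t() / tau          # (batch, batch) cosine similarities
    targets = torch.arange(z_rat.size(0), device=z_rat.device)
    return F.cross_entropy(logits, targets)
```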
arXiv Detail & Related papers (2022-05-13T16:38:38Z)
- Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness [21.037841262371355]
A notable challenge in Multi-Document Summarization (MDS) is the extreme length of the input.
We present an extract-then-abstract Transformer framework to overcome the problem.
We propose a loss weighting mechanism that makes the model aware of the unequal importance of sentences not in the pseudo extraction oracle.
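A minimal sketch of credit-aware weighting for the extractor's sentence loss; the specific scheme, discounting non-oracle sentences by how much reference content they still carry, is an assumption:

```python
import torch
import torch.nn.functional as F

def credit_weighted_loss(scores, oracle_mask, credit):
    """scores: (n_sents,) extractor logits; oracle_mask: (n_sents,) 1.0 for
    sentences in the pseudo extraction oracle; credit: (n_sents,) in [0, 1],
    e.g. each sentence's ROUGE overlap with the reference summary."""
    per_sent = F.binary_cross_entropy_with_logits(
        scores, oracle_mask.float(), reduction="none")
    # Oracle sentences keep full weight; a non-oracle sentence is penalized
    # less when it still carries reference content (high credit).
    weights = torch.where(oracle_mask.bool(),
                          torch.ones_like(credit), 1.0 - credit)
    return (weights * per_sent).mean()
```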
arXiv Detail & Related papers (2022-05-04T04:40:39Z)
- MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection [76.80153360498797]
We develop a multiple instance self-training framework (MIST) to efficiently refine task-specific discriminative representations.
MIST is composed of 1) a multiple instance pseudo label generator, which adapts a sparse continuous sampling strategy to produce more reliable clip-level pseudo labels, and 2) a self-guided attention boosted feature encoder.
Our method performs comparably to or even better than existing supervised and weakly supervised methods, obtaining a frame-level AUC of 94.83% on ShanghaiTech.
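A minimal sketch of sparse continuous sampling: split a video into a few segments and draw one contiguous clip from each, so pseudo labels cover the whole video without densely scoring every clip. Segment and clip sizes are illustrative assumptions:

```python
import torch

def sparse_continuous_sample(n_frames: int, n_segments: int = 8, clip_len: int = 16):
    """Return (n_segments, clip_len) frame indices: one contiguous clip
    sampled uniformly at random inside each equal-length segment."""
    seg_len = n_frames // n_segments
    assert seg_len >= clip_len, "video too short for this segment/clip size"
    starts = [s * seg_len + int(torch.randint(0, seg_len - clip_len + 1, (1,)))
              for s in range(n_segments)]
    return torch.stack([torch.arange(st, st + clip_len) for st in starts])
```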
arXiv Detail & Related papers (2021-04-04T15:47:14Z)
- Unsupervised Neural Aspect Search with Related Terms Extraction [0.3670422696827526]
We present a novel unsupervised neural network with a convolutional multi-attention mechanism that extracts (aspect, term) pairs simultaneously.
We apply a special loss aimed at improving the quality of multi-aspect extraction.
arXiv Detail & Related papers (2020-05-06T12:39:45Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results with respect to performance, computation, and memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
- Interpretable Multi-Headed Attention for Abstractive Summarization at Controllable Lengths [14.762731718325002]
Multi-level Summarizer (MLS) is a supervised method to construct abstractive summaries of a text document at controllable lengths.
MLS outperforms strong baselines by up to 14.70% in the METEOR score.
arXiv Detail & Related papers (2020-02-18T19:40:20Z)