Contrastive variational information bottleneck for aspect-based
sentiment analysis
- URL: http://arxiv.org/abs/2303.02846v3
- Date: Thu, 21 Dec 2023 07:35:18 GMT
- Title: Contrastive variational information bottleneck for aspect-based
sentiment analysis
- Authors: Mingshan Chang, Min Yang, Qingshan Jiang, and Ruifeng Xu
- Abstract summary: We propose to reduce spurious correlations for aspect-based sentiment analysis (ABSA) via a novel Contrastive Variational Information Bottleneck framework (called CVIB)
The proposed CVIB framework is composed of an original network and a self-pruned network, and these two networks are optimized simultaneously via contrastive learning.
Our approach achieves better performance than the strong competitors in terms of overall prediction performance, robustness, and generalization.
- Score: 36.83876224466177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning techniques have dominated the literature on aspect-based
sentiment analysis (ABSA), achieving state-of-the-art performance. However,
deep models generally suffer from spurious correlations between input features
and output labels, which hurts the robustness and generalization capability by
a large margin. In this paper, we propose to reduce spurious correlations for
ABSA, via a novel Contrastive Variational Information Bottleneck framework
(called CVIB). The proposed CVIB framework is composed of an original network
and a self-pruned network, and these two networks are optimized simultaneously
via contrastive learning. Concretely, we employ the Variational Information
Bottleneck (VIB) principle to learn an informative and compressed network
(self-pruned network) from the original network, which discards the superfluous
patterns or spurious correlations between input features and prediction labels.
Then, self-pruning contrastive learning is devised to pull together
semantically similar positive pairs and push away dissimilar pairs, where the
representations of the anchor learned by the original and self-pruned networks
respectively are regarded as a positive pair while the representations of two
different sentences within a mini-batch are treated as a negative pair. To
verify the effectiveness of our CVIB method, we conduct extensive experiments
on five benchmark ABSA datasets and the experimental results show that our
approach achieves better performance than the strong competitors in terms of
overall prediction performance, robustness, and generalization. Code and data
to reproduce the results in this paper are available at:
https://github.com/shesshan/CVIB.
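The abstract describes the self-pruning contrastive objective only in words: representations of the same sentence produced by the original and self-pruned networks form a positive pair, while representations of different sentences in the mini-batch serve as negatives. The sketch below is one plausible InfoNCE-style reading of that description, not the authors' released implementation (see the repository above for that); all names are hypothetical, and the VIB step that produces the self-pruned network is omitted.

```python
# Hypothetical sketch of the self-pruning contrastive loss described in the
# CVIB abstract (InfoNCE-style reading; the paper's exact formulation may differ).
import torch
import torch.nn.functional as F


def self_pruning_contrastive_loss(z_orig: torch.Tensor,
                                  z_pruned: torch.Tensor,
                                  temperature: float = 0.1) -> torch.Tensor:
    """z_orig, z_pruned: [batch, dim] representations of the same mini-batch of
    sentences from the original and self-pruned networks, respectively."""
    z_orig = F.normalize(z_orig, dim=-1)
    z_pruned = F.normalize(z_pruned, dim=-1)
    # Cosine-similarity logits between every (original, pruned) pair in the batch.
    logits = z_orig @ z_pruned.t() / temperature  # [batch, batch]
    # Diagonal entries are the positive pairs: the anchor sentence encoded by
    # both networks. Off-diagonal entries (different sentences) act as negatives.
    targets = torch.arange(z_orig.size(0), device=z_orig.device)
    return F.cross_entropy(logits, targets)
```

Under this reading, minimizing the cross-entropy over the similarity matrix pulls the two views of each sentence together and pushes apart representations of different sentences, which is the pull/push behavior the abstract attributes to self-pruning contrastive learning.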
Related papers
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes model learning by paying closer attention to training samples whose explanations differ substantially.
arXiv Detail & Related papers (2024-08-08T17:20:08Z) - DIVE: Subgraph Disagreement for Graph Out-of-Distribution Generalization [44.291382840373]
This paper addresses the challenge of out-of-distribution generalization in graph machine learning.
Traditional graph learning algorithms assume that training and testing data share the same distribution, and they falter in real-world scenarios where this assumption fails.
A principal factor contributing to this suboptimal performance is the inherent simplicity bias of neural networks.
arXiv Detail & Related papers (2024-08-08T12:08:55Z) - Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching.
An anchor branch is first trained to provide insights into the data properties.
A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z) - Cell Variational Information Bottleneck Network [6.164295534465283]
We propose a convolutional neural network using an information bottleneck mechanism, which can be combined with the latest feedforward network architectures.
Cell Variational Information Bottleneck Network is constructed by stacking VIB cells, which generate feature maps with uncertainty.
In a more complex representation learning task, face recognition, our network structure has also achieved very competitive results.
arXiv Detail & Related papers (2024-03-22T10:06:31Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Regularization Through Simultaneous Learning: A Case Study on Plant
Classification [0.0]
This paper introduces Simultaneous Learning, a regularization approach drawing on principles of Transfer Learning and Multi-task Learning.
We leverage auxiliary datasets alongside the target dataset, the UFOP-HVD, to facilitate simultaneous classification guided by a customized loss function.
Remarkably, our approach demonstrates superior performance over models without regularization.
arXiv Detail & Related papers (2023-05-22T19:44:57Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to 17.0% AUROC improvement over the state of the art and can serve as a simple yet strong baseline in this under-developed area (a generic energy-score sketch follows the related-papers list below).
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Deep Stable Learning for Out-Of-Distribution Generalization [27.437046504902938]
Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution.
Eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models.
We propose to address this problem by removing the dependencies between features via learning weights for training samples.
arXiv Detail & Related papers (2021-04-16T03:54:21Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and
Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z)
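For the energy-based OOD detection entry above, the underlying energy score is simple enough to sketch generically; the snippet below is an assumed illustration of that general idea, not GNNSafe itself, which additionally exploits graph structure in ways not shown here.

```python
# Generic energy-score sketch for OOD detection (assumed illustration; the
# GNNSafe paper builds on this idea with GNN-specific components).
import torch


def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """logits: [num_nodes, num_classes] classifier outputs, e.g. from a GNN.
    Lower energy suggests in-distribution; higher energy suggests OOD."""
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)


# Usage: flag nodes whose energy exceeds a threshold chosen on validation data.
# is_ood = energy_score(gnn_logits) > threshold
```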