Data Augmentation in a Hybrid Approach for Aspect-Based Sentiment
Analysis
- URL: http://arxiv.org/abs/2103.15912v1
- Date: Mon, 29 Mar 2021 19:43:15 GMT
- Title: Data Augmentation in a Hybrid Approach for Aspect-Based Sentiment
Analysis
- Authors: Tomas Liesting, Flavius Frasincar, Maria Mihaela Trusca
- Abstract summary: We investigate the effect of data augmentation on a state-of-the-art hybrid approach for aspect-based sentiment analysis (HAABSA).
We apply modified versions of easy data augmentation (EDA), backtranslation, and word mixup.
The best result is obtained with the adjusted version of EDA, which yields a 0.5 percentage point improvement on the SemEval 2016 dataset and a 1 percentage point increase on the SemEval 2015 dataset.
- Score: 1.469597968606607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation is a way to increase the diversity of available data by
applying constrained transformations on the original data. This strategy has
been widely used in image classification but has, to the best of our knowledge,
not yet been used in aspect-based sentiment analysis (ABSA). ABSA is a text
analysis technique that determines aspects and their associated sentiment in
opinionated text. In this paper, we investigate the effect of data augmentation
on a state-of-the-art hybrid approach for aspect-based sentiment analysis
(HAABSA). We apply modified versions of easy data augmentation (EDA),
backtranslation, and word mixup. We evaluate the proposed techniques on the
SemEval 2015 and SemEval 2016 datasets. The best result is obtained with the
adjusted version of EDA, which yields a 0.5 percentage point improvement on the
SemEval 2016 dataset and 1 percentage point increase on the SemEval 2015
dataset compared to the original HAABSA model.
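For illustration, below is a minimal sketch of what EDA-style augmentation adjusted for ABSA could look like: the standard EDA operations (synonym replacement, random swap, random deletion) are applied while the aspect terms are left untouched so the aspect-level sentiment labels remain valid. The synonym table and the exact adjustment are hypothetical placeholders, not the implementation used in the paper.

```python
import random

# Toy synonym table standing in for a lexical resource such as WordNet;
# purely illustrative, not the lexicon used in the paper.
SYNONYMS = {
    "great": ["excellent", "superb"],
    "slow": ["sluggish", "unhurried"],
    "friendly": ["welcoming", "pleasant"],
}

def eda_augment(tokens, aspect_terms, p=0.1, rng=random):
    """EDA-style augmentation sketch: synonym replacement, random swap,
    and random deletion, skipping aspect terms so the aspect-level
    sentiment labels stay valid (one plausible 'adjusted EDA')."""
    out = list(tokens)
    # 1) Synonym replacement for non-aspect words.
    for i, tok in enumerate(out):
        if tok not in aspect_terms and tok in SYNONYMS and rng.random() < p:
            out[i] = rng.choice(SYNONYMS[tok])
    # 2) Random swap of two non-aspect positions.
    movable = [i for i, tok in enumerate(out) if tok not in aspect_terms]
    if len(movable) >= 2 and rng.random() < p:
        i, j = rng.sample(movable, 2)
        out[i], out[j] = out[j], out[i]
    # 3) Random deletion of non-aspect words.
    out = [tok for tok in out if tok in aspect_terms or rng.random() >= p]
    return out

sentence = "the food was great but the service was slow".split()
print(eda_augment(sentence, aspect_terms={"food", "service"}, p=0.3))
```

Backtranslation and word mixup follow the same goal of producing label-preserving variants, with mixup operating on word embeddings rather than on the tokens themselves.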
Related papers
- Unveiling the Superior Paradigm: A Comparative Study of Source-Free Domain Adaptation and Unsupervised Domain Adaptation [52.36436121884317]
We show that Source-Free Domain Adaptation (SFDA) generally outperforms Unsupervised Domain Adaptation (UDA) in real-world scenarios.
SFDA offers advantages in time efficiency, storage requirements, targeted learning objectives, reduced risk of negative transfer, and increased robustness against overfitting.
We propose a novel weight estimation method that effectively integrates available source data into multi-SFDA approaches.
arXiv Detail & Related papers (2024-11-24T13:49:29Z) - Data Augmentation for Image Classification using Generative AI [8.74488498507946]
Data augmentation is a promising solution to expanding the dataset size.
Recent approaches use generative AI models to improve dataset diversity.
We propose Automated Generative Data Augmentation (AGA).
arXiv Detail & Related papers (2024-08-31T21:16:43Z) - Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews [2.0143010051030417]
Aspect-based Sentiment Analysis (ABSA) is a critical task in Natural Language Processing (NLP).
Traditional sentiment analysis methods, while useful for determining overall sentiment, often miss the implicit opinions about particular product or service features.
This paper presents a comprehensive review of the evolution of ABSA methodologies, from lexicon-based approaches to machine learning.
arXiv Detail & Related papers (2024-08-23T16:31:07Z) - Towards Robust Aspect-based Sentiment Analysis through
Non-counterfactual Augmentations [40.71705332298682]
We present an alternative approach that relies on non-counterfactual data augmentation.
Our approach further establishes a new state-of-the-art on the ABSA robustness benchmark and transfers well across domains.
arXiv Detail & Related papers (2023-06-24T13:57:32Z) - A Novel Counterfactual Data Augmentation Method for Aspect-Based
Sentiment Analysis [7.921043998643318]
We propose a novel and simple counterfactual data augmentation method to generate opinion expressions with reversed sentiment polarity.
The experimental results show the proposed counterfactual data augmentation method performs better than current augmentation methods on three ABSA datasets.
arXiv Detail & Related papers (2023-06-20T03:25:51Z) - IDA: Informed Domain Adaptive Semantic Segmentation [51.12107564372869]
We propose an Informed Domain Adaptation (IDA) model, a self-training framework that mixes data based on class-level segmentation performance.
In our IDA model, class-level performance is tracked by an expected confidence score (ECS), and a dynamic schedule then determines the mixing ratio for data from different domains.
Our proposed method is able to outperform the state-of-the-art UDA-SS method by a margin of 1.1 mIoU in the adaptation of GTA-V to Cityscapes and of 0.9 mIoU in the adaptation of SYNTHIA to Cityscapes.
arXiv Detail & Related papers (2023-03-05T18:16:34Z) - Dataset Distillation via Factorization [58.8114016318593]
We introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing dataset distillation (DD) baseline.
HaBa explores decomposing a dataset into two components: data Hallucination networks and Bases.
Our method can yield significant improvement on downstream classification tasks compared with the previous state of the art, while reducing the total number of compressed parameters by up to 65%.
arXiv Detail & Related papers (2022-10-30T08:36:19Z) - CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense a dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z) - LiDAR dataset distillation within bayesian active learning framework:
Understanding the effect of data augmentation [63.20765930558542]
Active learning (AL) has recently regained attention as a way to reduce annotation costs and dataset size.
This paper performs a principled evaluation of AL-based dataset distillation on a quarter (1/4th) of the large Semantic-KITTI dataset.
We observe that data augmentation achieves full dataset accuracy using only 60% of samples from the selected dataset configuration.
arXiv Detail & Related papers (2022-02-06T00:04:21Z) - CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for
Natural Language Understanding [67.61357003974153]
We propose a novel data augmentation framework dubbed CoDA.
CoDA synthesizes diverse and informative augmented examples by integrating multiple transformations organically.
A contrastive regularization objective is introduced to capture the global relationship among all the data samples.
arXiv Detail & Related papers (2020-10-16T23:57:03Z) - On the Generalization Effects of Linear Transformations in Data
Augmentation [32.01435459892255]
Data augmentation is a powerful technique to improve performance in applications such as image and text classification tasks.
We consider a family of linear transformations and study their effects on the ridge estimator in an over-parametrized linear regression setting.
We propose an augmentation scheme that searches over the space of transformations by how uncertain the model is about the transformed data; a generic sketch of this idea appears after this list.
arXiv Detail & Related papers (2020-05-02T04:10:21Z)
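As a rough illustration of that last idea (not the estimator analysed in the cited paper), uncertainty-guided selection of a transformation can be sketched by scoring each candidate by the model's predictive entropy on the transformed input; `predict_proba` below is a hypothetical model hook.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain_transform(x, transforms, predict_proba):
    """Return the candidate transformation whose output the model is most
    uncertain about; a generic sketch of uncertainty-guided augmentation search."""
    scored = [(predictive_entropy(predict_proba(t(x))), t) for t in transforms]
    return max(scored, key=lambda pair: pair[0])[1]

# Toy usage with a dummy two-class model and two simple transformations.
def flip(seq):
    return seq[::-1]

def drop_last(seq):
    return seq[:-1]

def dummy_model(seq):
    return [0.5, 0.5] if len(seq) % 2 == 0 else [0.9, 0.1]

chosen = most_uncertain_transform([1, 2, 3, 4], [flip, drop_last], dummy_model)
print(chosen.__name__)  # 'flip' here: the even-length output is the most ambiguous
```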
This list is automatically generated from the titles and abstracts of the papers on this site.