M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis
- URL: http://arxiv.org/abs/2502.11824v1
- Date: Mon, 17 Feb 2025 14:16:01 GMT
- Title: M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis
- Authors: Chengyan Wu, Bolei Ma, Yihong Liu, Zheyu Zhang, Ningyuan Deng, Yanshu Li, Baolan Chen, Yi Zhang, Barbara Plank, Yun Xue
- Abstract summary: We present M-ABSA, a comprehensive dataset spanning 7 domains and 21 languages.
Our primary focus is on triplet extraction, which involves identifying aspect terms, aspect categories, and sentiment polarities.
Our empirical findings highlight that the dataset enables diverse evaluation tasks, such as multilingual and multi-domain transfer learning.
- Score: 23.523947343171926
- Abstract: Aspect-based sentiment analysis (ABSA) is a crucial task in information extraction and sentiment analysis, aiming to identify aspects with associated sentiment elements in text. However, existing ABSA datasets are predominantly English-centric, limiting the scope for multilingual evaluation and research. To bridge this gap, we present M-ABSA, a comprehensive dataset spanning 7 domains and 21 languages, making it the most extensive multilingual parallel dataset for ABSA to date. Our primary focus is on triplet extraction, which involves identifying aspect terms, aspect categories, and sentiment polarities. The dataset is constructed through an automatic translation process with human review to ensure quality. We perform extensive experiments using various baselines to assess performance and compatibility on M-ABSA. Our empirical findings highlight that the dataset enables diverse evaluation tasks, such as multilingual and multi-domain transfer learning, and large language model evaluation, underscoring its inclusivity and its potential to drive advancements in multilingual ABSA research.
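The triplet structure named in the abstract (aspect term, aspect category, sentiment polarity) can be illustrated with a minimal sketch. The class name `Triplet`, the category labels, the example sentence, and the exact-match F1 scoring are all assumptions made for illustration; they are not taken from the M-ABSA release or its evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One ABSA triplet: (aspect term, aspect category, sentiment polarity)."""
    aspect_term: str
    aspect_category: str  # hypothetical label scheme, not from the paper
    polarity: str         # e.g. "positive", "negative", "neutral"


def exact_match_f1(gold, pred):
    """F1 where a predicted triplet counts only if all three fields match."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example sentence with two gold triplets.
sentence = "The pasta was great but the service was slow."
gold = [
    Triplet("pasta", "FOOD#QUALITY", "positive"),
    Triplet("service", "SERVICE#GENERAL", "negative"),
]
# A hypothetical model output that gets one polarity wrong.
pred = [
    Triplet("pasta", "FOOD#QUALITY", "positive"),
    Triplet("service", "SERVICE#GENERAL", "neutral"),
]

score = exact_match_f1(gold, pred)
print(f"exact-match F1: {score:.2f}")
```

Under this strict matching, a single wrong field invalidates the whole triplet, which is why one polarity error here halves both precision and recall.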
Related papers
- BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages [93.92804151830744]
We present BRIGHTER, a collection of emotion-annotated datasets in 28 different languages.
We describe the data collection and annotation processes and the challenges of building these datasets.
We show that BRIGHTER datasets are a step towards bridging the gap in text-based emotion recognition.
arXiv Detail & Related papers (2025-02-17T15:39:50Z)
- ROAST: Review-level Opinion Aspect Sentiment Target Joint Detection for ABSA [50.90538760832107]
This research presents a novel task, Review-Level Opinion Aspect Sentiment Target (ROAST).
ROAST seeks to close the gap between sentence-level and text-level ABSA by identifying every ABSA constituent at the review level.
We extend the available datasets to enable ROAST, addressing the drawbacks noted in previous research.
arXiv Detail & Related papers (2024-05-30T17:29:15Z)
- Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval [62.82448161570428]
This dataset is designed to investigate fairness in a multilingual information retrieval context.
It boasts an authentic multilingual corpus, featuring topics translated into all 24 languages.
It offers rich demographic information associated with its documents, facilitating the study of demographic bias.
arXiv Detail & Related papers (2023-11-03T12:29:11Z)
- MEMD-ABSA: A Multi-Element Multi-Domain Dataset for Aspect-Based Sentiment Analysis [23.959356414518957]
We propose a large-scale Multi-Element Multi-Domain dataset (MEMD) that covers the four elements across five domains.
We evaluate generative and non-generative baselines on multiple ABSA subtasks under the open domain setting.
arXiv Detail & Related papers (2023-06-29T14:03:49Z)
- Survey of Aspect-based Sentiment Analysis Datasets [55.61047894397937]
Aspect-based sentiment analysis (ABSA) is a natural language processing problem that requires analyzing user-generated reviews.
Because ABSA corpora are numerous yet scattered, researchers struggle to quickly identify the corpora best suited to a specific ABSA subtask.
This study aims to present a database of corpora that can be used to train and assess autonomous ABSA systems.
arXiv Detail & Related papers (2022-04-11T16:23:36Z)
- CL-XABSA: Contrastive Learning for Cross-lingual Aspect-based Sentiment Analysis [4.60495447017298]
We propose a novel framework, CL-XABSA: Contrastive Learning for Cross-lingual Aspect-Based Sentiment Analysis.
Specifically, we design two contrastive strategies: token-level contrastive learning of token embeddings (TL-CTE) and sentiment-level contrastive learning of token embeddings (SL-CTE).
Because our framework can receive datasets in multiple languages during training, it can be adapted not only to the XABSA task but also to multilingual aspect-based sentiment analysis (MABSA).
arXiv Detail & Related papers (2022-04-02T07:40:03Z)
- AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)
- Transformer-based Multi-Aspect Modeling for Multi-Aspect Multi-Sentiment Analysis [56.893393134328996]
We propose a novel Transformer-based Multi-aspect Modeling scheme (TMM), which can capture potential relations between multiple aspects and simultaneously detect the sentiment of all aspects in a sentence.
Our method achieves noticeable improvements compared with strong baselines such as BERT and RoBERTa.
arXiv Detail & Related papers (2020-11-01T11:06:31Z)