MARS: Multilingual Aspect-centric Review Summarisation
- URL: http://arxiv.org/abs/2410.09991v1
- Date: Sun, 13 Oct 2024 20:16:39 GMT
- Title: MARS: Multilingual Aspect-centric Review Summarisation
- Authors: Sandeep Sricharan Mukku, Abinesh Kanagarajan, Chetan Aggarwal, Promod Yenigalla
- Abstract summary: We propose MARS, a novel framework built on a two-step Extract-then-Summarise paradigm.
Our approach brings substantial improvements over abstractive baselines and efficiency to real-time systems.
- Score: 3.0894650827875227
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Summarizing customer feedback to provide actionable insights for products and services at scale is an important problem for businesses across industries. Review volumes are growing across regions and languages, so aggregating and understanding customer sentiment across multiple languages is increasingly vital. In this paper, we propose MARS, a novel framework built on a two-step Extract-then-Summarise paradigm, to move beyond traditional approaches and address domain-agnostic, aspect-level multilingual review summarisation. Extensive automatic and human evaluation shows that our approach brings substantial improvements over abstractive baselines and brings efficiency to real-time systems.
Related papers
- Demystifying Multilingual Chain-of-Thought in Process Reward Modeling [71.12193680015622]
We tackle the challenge of extending process reward models (PRMs) to multilingual settings.
We train multilingual PRMs on a dataset spanning seven languages, which is translated from English.
Our results highlight the sensitivity of multilingual PRMs to both the number of training languages and the volume of English data.
arXiv Detail & Related papers (2025-02-18T09:11:44Z)
- LFOSum: Summarizing Long-form Opinions with Large Language Models [7.839083566878183]
This paper introduces (1) a new dataset of long-form user reviews, each entity comprising over a thousand reviews, (2) two training-free LLM-based summarization approaches that scale to long inputs, and (3) automatic evaluation metrics.
Our dataset of user reviews is paired with in-depth and unbiased critical summaries by domain experts, serving as a reference for evaluation.
Our evaluation reveals that LLMs still face challenges in balancing sentiment and format adherence in long-form summaries, though open-source models can narrow the gap when relevant information is retrieved in a focused manner.
arXiv Detail & Related papers (2024-10-16T20:52:39Z)
- PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis [74.41260927676747]
This paper bridges the gaps by introducing multimodal conversational Aspect-based Sentiment Analysis (ABSA).
To benchmark the tasks, we construct PanoSent, a dataset annotated both manually and automatically, featuring high quality, large scale, multimodality, multilingualism, multi-scenarios, and covering both implicit and explicit sentiment elements.
To effectively address the tasks, we devise a novel Chain-of-Sentiment reasoning framework, together with a novel multimodal large language model (namely Sentica) and a paraphrase-based verification mechanism.
arXiv Detail & Related papers (2024-08-18T13:51:01Z)
- Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z)
- Long Dialog Summarization: An Analysis [28.223798877781054]
This work emphasizes the significance of creating coherent and contextually rich summaries for effective communication in various applications.
We explore current state-of-the-art approaches for long dialog summarization in different domains. Evaluations based on benchmark metrics show that no single model performs well across the various areas and distinct summarization tasks.
arXiv Detail & Related papers (2024-02-26T19:35:45Z) - Dialogue Quality and Emotion Annotations for Customer Support
Conversations [7.218791626731783]
This paper presents a holistic annotation approach for emotion and conversational quality in the context of bilingual customer support conversations.
It provides a unique and valuable resource for the development of text classification models.
arXiv Detail & Related papers (2023-11-23T10:56:14Z) - Improving Factuality and Reasoning in Language Models through Multiagent
Debate [95.10641301155232]
We present a complementary approach to improve language responses where multiple language model instances propose and debate their individual responses and reasoning processes over multiple rounds to arrive at a common final answer.
Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.
Our approach may be directly applied to existing black-box models and uses identical procedure and prompts for all tasks we investigate.
arXiv Detail & Related papers (2023-05-23T17:55:11Z) - OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models [122.27878464009181]
We conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks.
OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available.
arXiv Detail & Related papers (2023-05-13T11:28:37Z) - Abstractive Meeting Summarization: A Survey [15.455647477995306]
A system that could reliably identify and sum up the most important points of a conversation would be valuable in a wide variety of real-world contexts.
Recent advances in deep learning have significantly improved language generation systems, opening the door to improved forms of abstractive summarization.
We provide an overview of the challenges raised by the task of abstractive meeting summarization and of the data sets, models and evaluation metrics that have been used to tackle the problems.
arXiv Detail & Related papers (2022-08-08T14:04:38Z) - A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion
Mining [0.0]
We propose a method for building a single topic model with sentiment analysis capable of covering multiple languages simultaneously.
We apply the model to newspaper articles and user comments of a specific domain, i.e., organic food products.
We obtain a high proportion of stable and domain-relevant topics, a meaningful relation between topics and their respective contents, and an interpretable representation for social media documents.
arXiv Detail & Related papers (2021-11-03T14:49:50Z) - Topic-Oriented Spoken Dialogue Summarization for Customer Service with
Saliency-Aware Topic Modeling [61.67321200994117]
In a customer service system, dialogue summarization can boost service efficiency by creating summaries for long spoken dialogues.
In this work, we focus on topic-oriented dialogue summarization, which generates highly abstractive summaries.
We propose a novel topic-augmented two-stage dialogue summarizer (TDS) jointly with a saliency-aware neural topic model (SATM) for topic-oriented summarization of customer service dialogues.
arXiv Detail & Related papers (2020-12-14T07:50:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.