Better Handling Coreference Resolution in Aspect Level Sentiment
Classification by Fine-Tuning Language Models
- URL: http://arxiv.org/abs/2307.05646v1
- Date: Tue, 11 Jul 2023 12:43:28 GMT
- Title: Better Handling Coreference Resolution in Aspect Level Sentiment
Classification by Fine-Tuning Language Models
- Authors: Dhruv Mullick, Bilal Ghanem, Alona Fyshe
- Abstract summary: Monitoring customer feedback can be automated with Aspect Level Sentiment Classification (ALSC).
Large Language Models (LLMs) are the heart of many state-of-the-art ALSC solutions, but they perform poorly in some scenarios requiring Coreference Resolution (CR).
We propose a framework to improve an LLM's performance on CR-containing reviews by fine-tuning on highly inferential tasks.
- Score: 4.2605449879340656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Customer feedback is invaluable to companies as they refine their products.
Monitoring customer feedback can be automated with Aspect Level Sentiment
Classification (ALSC), which allows us to analyse specific aspects of the
products in reviews. Large Language Models (LLMs) are the heart of many
state-of-the-art ALSC solutions, but they perform poorly in some scenarios
requiring Coreference Resolution (CR). In this work, we propose a framework to
improve an LLM's performance on CR-containing reviews by fine-tuning on highly
inferential tasks. We show that the performance improvement is likely
attributable to the model's improved CR ability. We also release a new dataset that
focuses on CR in ALSC.
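To make the task concrete, here is a minimal sketch of what an ALSC instance that depends on coreference might look like. The review text, the `AlscExample` class, the `PRONOUNS` set, and the `needs_coreference` heuristic are all illustrative assumptions; none of them come from the paper or its released dataset.

```python
from dataclasses import dataclass

# Hypothetical pronoun list, used only for this illustration.
PRONOUNS = {"it", "they", "them", "its", "their"}


@dataclass
class AlscExample:
    review: str  # full review text
    aspect: str  # aspect term whose sentiment is classified
    label: str   # "positive", "negative", or "neutral"


def needs_coreference(example: AlscExample) -> bool:
    """Crude heuristic: the aspect term and a pronoun both occur in the review,
    so the opinion may be attached to the pronoun rather than the aspect term."""
    tokens = [t.strip(".,!?").lower() for t in example.review.split()]
    return example.aspect.lower() in tokens and any(t in PRONOUNS for t in tokens)


# The opinion "far too slow" is attached to "It", which refers to "laptop".
example = AlscExample(
    review="I bought the laptop last week. It is far too slow.",
    aspect="laptop",
    label="negative",
)
```

In this example a model must link "It" back to "laptop" before it can assign the negative label to the aspect, which is the kind of case the abstract describes as CR-containing.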
Related papers
- RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques [59.861013614500024]
We introduce a new benchmark designed to assess the critique capabilities of Large Language Models (LLMs).
Unlike existing benchmarks, which typically function in an open-loop fashion, our approach employs a closed-loop methodology that evaluates the quality of corrections generated from critiques.
arXiv Detail & Related papers (2025-01-24T13:48:10Z)
- Enabling Scalable Oversight via Self-Evolving Critic [59.861013614500024]
SCRIT (Self-evolving CRITic) is a framework that enables genuine self-evolution of critique abilities.
It self-improves by training on synthetic data, generated by a contrastive-based self-critic.
It achieves up to a 10.3% improvement on critique-correction and error identification benchmarks.
arXiv Detail & Related papers (2025-01-10T05:51:52Z)
- EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation [58.546205554954454]
We propose Enhancing Alignment in MLLMs via Critical Observation (EACO).
EACO economically aligns MLLMs using self-generated preference data from only 5k images.
EACO reduces the overall hallucinations by 65.6% on HallusionBench and improves the reasoning ability by 21.8% on MME-Cognition.
arXiv Detail & Related papers (2024-12-06T09:59:47Z)
- ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs [95.15814662348245]
Compositional Reasoning (CR) entails grasping the significance of attributes, relations, and word order.
Recent Vision-Language Models (VLMs) have demonstrated remarkable proficiency in such reasoning tasks.
arXiv Detail & Related papers (2024-06-12T12:54:27Z)
- Calibrated Self-Rewarding Vision Language Models [27.686545023186852]
Large Vision-Language Models (LVLMs) have made substantial progress by integrating pre-trained large language models (LLMs) and vision models through instruction tuning.
LVLMs often exhibit the hallucination phenomenon, where generated text responses appear linguistically plausible but contradict the input image.
We propose the Calibrated Self-Rewarding (CSR) approach, which enables the model to self-improve by iteratively generating candidate responses, evaluating the reward for each response, and curating preference data for fine-tuning.
arXiv Detail & Related papers (2024-05-23T14:30:33Z)
- CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration [59.48235003469116]
We show that data augmentation consistently enhances OOD performance.
We also show that CF augmented models which are easier to calibrate also exhibit much lower entropy when assigning importance.
arXiv Detail & Related papers (2023-09-14T16:16:40Z)
- Towards Automated Classification of Code Review Feedback to Support Analytics [4.423428708304586]
This study aims to develop an automated code review comment classification system.
We trained and evaluated supervised learning-based DNN models leveraging code context, comment text, and a set of code metrics.
Our approach outperforms Fregnan et al.'s approach by achieving 18.7% higher accuracy.
arXiv Detail & Related papers (2023-07-07T21:53:20Z)
- CRACT: Cascaded Regression-Align-Classification for Robust Visual Tracking [97.84109669027225]
We introduce an improved proposal refinement module, Cascaded Regression-Align-Classification (CRAC).
CRAC yields new state-of-the-art performances on many benchmarks.
In experiments on seven benchmarks including OTB-2015, UAV123, NfS, VOT-2018, TrackingNet, GOT-10k and LaSOT, our CRACT exhibits very promising results in comparison with state-of-the-art competitors.
arXiv Detail & Related papers (2020-11-25T02:18:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.