Better Handling Coreference Resolution in Aspect Level Sentiment
Classification by Fine-Tuning Language Models
- URL: http://arxiv.org/abs/2307.05646v1
- Date: Tue, 11 Jul 2023 12:43:28 GMT
- Title: Better Handling Coreference Resolution in Aspect Level Sentiment
Classification by Fine-Tuning Language Models
- Authors: Dhruv Mullick, Bilal Ghanem, Alona Fyshe
- Abstract summary: Monitoring customer feedback can be automated with Aspect Level Sentiment Classification (ALSC).
Large Language Models (LLMs) are the heart of many state-of-the-art ALSC solutions, but they perform poorly in some scenarios requiring Coreference Resolution (CR).
We propose a framework to improve an LLM's performance on CR-containing reviews by fine-tuning on highly inferential tasks.
- Score: 4.2605449879340656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Customer feedback is invaluable to companies as they refine their products.
Monitoring customer feedback can be automated with Aspect Level Sentiment
Classification (ALSC), which allows us to analyse specific aspects of the
products in reviews. Large Language Models (LLMs) are the heart of many
state-of-the-art ALSC solutions, but they perform poorly in some scenarios
requiring Coreference Resolution (CR). In this work, we propose a framework to
improve an LLM's performance on CR-containing reviews by fine-tuning on highly
inferential tasks. We show that the performance improvement is likely
attributable to the model's improved CR ability. We also release a new dataset that
focuses on CR in ALSC.
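To make the terms in the abstract concrete, the following is a minimal sketch of what an ALSC instance looks like and of a review whose sentiment can only be assigned to the aspect by resolving a pronoun. The class `ALSCExample`, the pronoun list, the function `may_need_coreference`, and both example reviews are hypothetical illustrations, not taken from the paper or its dataset. The pronoun check is a crude stand-in for real coreference resolution and is not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ALSCExample:
    """One ALSC instance: a review, the aspect of interest, and its label."""
    review: str
    aspect: str
    sentiment: str  # assumed label set: "positive", "negative", "neutral"


# Hypothetical, deliberately small pronoun list used only for illustration.
PRONOUNS = {"it", "they", "them", "this", "that", "these", "those"}


def may_need_coreference(example: ALSCExample) -> bool:
    """Crude heuristic: flag reviews that contain at least one pronoun.

    A real system would run a coreference resolver; this only shows why
    such reviews are harder, since the opinion word is attached to a
    pronoun rather than to the aspect term itself.
    """
    tokens = {tok.strip(".,!?;:").lower() for tok in example.review.split()}
    return bool(tokens & PRONOUNS)


# The aspect term appears right next to the opinion word.
direct = ALSCExample(
    review="The battery life is excellent.",
    aspect="battery life",
    sentiment="positive",
)

# "It" must be resolved to "soup" before "cold" can be tied to the aspect.
with_cr = ALSCExample(
    review="The waiter brought the soup quickly. It was cold.",
    aspect="soup",
    sentiment="negative",
)

if __name__ == "__main__":
    for ex in (direct, with_cr):
        print(ex.aspect, ex.sentiment, may_need_coreference(ex))
```

In the second review the word "cold" never appears beside "soup", which is the kind of case the abstract says LLM-based ALSC systems tend to handle poorly.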
Related papers
- RMB: Comprehensively Benchmarking Reward Models in LLM Alignment [44.84304822376291]
Reward models (RMs) guide the alignment of large language models (LLMs).
We propose RMB, a comprehensive RM benchmark that covers over 49 real-world scenarios.
Based on our benchmark, we conduct extensive analysis on the state-of-the-art RMs.
arXiv Detail & Related papers (2024-10-13T16:06:54Z) - Large Language Models for Page Stream Segmentation [0.03495246564946555]
Page Stream Segmentation (PSS) is an essential prerequisite for automated document processing at scale.
This paper introduces TABME++, an enhanced benchmark featuring commercial Optical Character Recognition (OCR) annotations.
We evaluate the performance of large language models (LLMs) on PSS, focusing on decoder-based models fine-tuned with parameter-efficient methods.
arXiv Detail & Related papers (2024-08-21T20:28:42Z) - Learning to Refine with Fine-Grained Natural Language Feedback [81.70313509881315]
We propose looking at refinement with feedback as a composition of three distinct LLM competencies.
A key property of the proposed Detect, Critique, Refine ("DCR") method is that the step 2 critique model can give fine-grained feedback about errors.
We show that models of different capabilities benefit from refining with DCR on the task of improving factual consistency of document grounded summaries.
arXiv Detail & Related papers (2024-07-02T16:15:01Z) - A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems [67.52782366565658]
State-of-the-art recommender systems (RSs) depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables.
Despite the prosperity of lightweight embedding-based RSs, a wide diversity is seen in evaluation protocols.
This study investigates various LERS' performance, efficiency, and cross-task transferability via a thorough benchmarking process.
arXiv Detail & Related papers (2024-06-25T07:45:00Z) - ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs [95.15814662348245]
Compositional Reasoning (CR) entails grasping the significance of attributes, relations, and word order.
Recent Vision-Language Models (VLMs) have demonstrated remarkable proficiency in such reasoning tasks.
arXiv Detail & Related papers (2024-06-12T12:54:27Z) - Calibrated Self-Rewarding Vision Language Models [27.686545023186852]
Large Vision-Language Models (LVLMs) have made substantial progress by integrating pre-trained large language models (LLMs) and vision models through instruction tuning.
LVLMs often exhibit the hallucination phenomenon, where generated text responses appear linguistically plausible but contradict the input image.
We propose the Calibrated Self-Rewarding (CSR) approach, which enables the model to self-improve by iteratively generating candidate responses, evaluating the reward for each response, and curating preference data for fine-tuning.
arXiv Detail & Related papers (2024-05-23T14:30:33Z) - CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain
Performance and Calibration [59.48235003469116]
We show that data augmentation consistently enhances OOD performance.
We also show that CF-augmented models that are easier to calibrate exhibit much lower entropy when assigning importance.
arXiv Detail & Related papers (2023-09-14T16:16:40Z) - Towards Automated Classification of Code Review Feedback to Support
Analytics [4.423428708304586]
This study aims to develop an automated code review comment classification system.
We trained and evaluated supervised learning-based DNN models leveraging code context, comment text, and a set of code metrics.
Our approach outperforms Fregnan et al.'s approach by achieving 18.7% higher accuracy.
arXiv Detail & Related papers (2023-07-07T21:53:20Z) - High Quality Segmentation for Ultra High-resolution Images [72.97958314291648]
We propose the Continuous Refinement Model for the ultra high-resolution segmentation refinement task.
Our proposed method is fast and effective on image segmentation refinement.
arXiv Detail & Related papers (2021-11-29T11:53:06Z) - CRACT: Cascaded Regression-Align-Classification for Robust Visual
Tracking [97.84109669027225]
We introduce an improved proposal refinement module, Cascaded Regression-Align-Classification (CRAC).
CRAC yields new state-of-the-art performances on many benchmarks.
In experiments on seven benchmarks including OTB-2015, UAV123, NfS, VOT-2018, TrackingNet, GOT-10k and LaSOT, our CRACT exhibits very promising results in comparison with state-of-the-art competitors.
arXiv Detail & Related papers (2020-11-25T02:18:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.