Multi-Facet Counterfactual Learning for Content Quality Evaluation
- URL: http://arxiv.org/abs/2410.07693v1
- Date: Thu, 10 Oct 2024 08:04:10 GMT
- Title: Multi-Facet Counterfactual Learning for Content Quality Evaluation
- Authors: Jiasheng Zheng, Hongyu Lin, Boxi Cao, Meng Liao, Yaojie Lu, Xianpei Han, Le Sun
- Abstract summary: We propose a framework for efficiently constructing evaluators that perceive multiple facets of content quality evaluation.
We leverage a joint training strategy based on contrastive learning and supervised learning to enable the evaluator to distinguish between different quality facets.
- Score: 48.73583736357489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evaluating the quality of documents is essential for filtering valuable content from the current massive amount of information. Conventional approaches typically rely on a single score as a supervision signal for training content quality evaluators, which is inadequate to differentiate documents with quality variations across multiple facets. In this paper, we propose Multi-facet cOunterfactual LEarning (MOLE), a framework for efficiently constructing evaluators that perceive multiple facets of content quality evaluation. Given a specific scenario, we prompt large language models to generate counterfactual content that exhibits variations in critical quality facets compared to the original document. Furthermore, we leverage a joint training strategy based on contrastive learning and supervised learning to enable the evaluator to distinguish between different quality facets, resulting in more accurate predictions of content quality scores. Experimental results on two datasets across different scenarios demonstrate that our proposed MOLE framework effectively improves the correlation of document content quality evaluations with human judgments, serving as a valuable toolkit for effective information acquisition.
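The abstract does not spell out the joint objective, so the following is only a minimal sketch of one plausible way to combine a supervised score-regression loss with a contrastive term over original/counterfactual pairs. The model shape, loss forms, and all names (`FacetEvaluator`, `joint_loss`, the margin and weighting values) are illustrative assumptions, not the authors' implementation.

```python
# A minimal, illustrative sketch of joint contrastive + supervised training for a
# multi-facet quality evaluator, in the spirit of the MOLE abstract above.
# Everything here (model shape, loss weighting, margin value) is an assumption for
# illustration; it is NOT the authors' released implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FacetEvaluator(nn.Module):
    """Encodes a document embedding and predicts one score per quality facet."""
    def __init__(self, emb_dim: int = 768, num_facets: int = 4):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(emb_dim, 256), nn.ReLU())
        self.score_head = nn.Linear(256, num_facets)  # one scalar score per facet

    def forward(self, doc_emb: torch.Tensor):
        h = self.proj(doc_emb)
        return h, self.score_head(h)

def joint_loss(model, orig_emb, cf_emb, facet_ids, gold_scores, margin=0.5, alpha=1.0):
    """Supervised regression on gold facet scores plus a contrastive margin term that
    pushes the original document's score above its counterfactual's score on the
    facet that the counterfactual degrades."""
    _, orig_scores = model(orig_emb)   # (B, num_facets)
    _, cf_scores = model(cf_emb)       # each counterfactual perturbs one facet
    supervised = F.mse_loss(orig_scores, gold_scores)
    idx = facet_ids.unsqueeze(1)                      # (B, 1): index of perturbed facet
    s_orig = orig_scores.gather(1, idx).squeeze(1)
    s_cf = cf_scores.gather(1, idx).squeeze(1)
    contrastive = F.relu(margin - (s_orig - s_cf)).mean()
    return supervised + alpha * contrastive

# Toy usage with random "document embeddings" standing in for an encoder's output.
model = FacetEvaluator()
orig, counterfactual = torch.randn(8, 768), torch.randn(8, 768)
facets, gold = torch.randint(0, 4, (8,)), torch.rand(8, 4)
loss = joint_loss(model, orig, counterfactual, facets, gold)
loss.backward()
```

In this sketch the margin term separates scores only along the facet that the counterfactual perturbed, which is one way an evaluator could become sensitive to individual facets rather than to a single aggregate score.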
Related papers
- ARTICLE: Annotator Reliability Through In-Context Learning [18.818071256242327]
We propose ARTICLE, an in-context learning framework to estimate annotation quality through self-consistency.
Our findings indicate that ARTICLE can be used as a robust method for identifying reliable annotators, hence improving data quality.
arXiv Detail & Related papers (2024-09-18T17:59:32Z)
- Mashee at SemEval-2024 Task 8: The Impact of Samples Quality on the Performance of In-Context Learning for Machine Text Classification [0.0]
We employ the chi-square test to identify high-quality samples and compare the results with those obtained using low-quality samples.
Our findings demonstrate that utilizing high-quality samples leads to improved performance with respect to all evaluated metrics.
arXiv Detail & Related papers (2024-05-28T12:47:43Z)
- Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
arXiv Detail & Related papers (2024-02-16T13:53:26Z)
- QuRating: Selecting High-Quality Data for Training Language Models [64.83332850645074]
We introduce QuRating, a method for selecting pre-training data that can capture human intuitions about data quality.
In this paper, we investigate four qualities - writing style, required expertise, facts & trivia, and educational value.
We train a QuRater model to learn scalar ratings from pairwise judgments, and use it to annotate a 260B training corpus with quality ratings for each of the four criteria (a minimal sketch of this kind of pairwise training appears after this list).
arXiv Detail & Related papers (2024-02-15T06:36:07Z)
- Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels [95.44077384918725]
We propose to teach large multi-modality models (LMMs) with text-defined rating levels instead of scores.
The proposed Q-Align achieves state-of-the-art performance on image quality assessment (IQA), image aesthetic assessment (IAA) and video quality assessment (VQA) tasks.
arXiv Detail & Related papers (2023-12-28T16:10:25Z)
- Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study [63.27346930921658]
ChatGPT is capable of evaluating text quality effectively from various perspectives without reference.
The Explicit Score, which utilizes ChatGPT to generate a numeric score measuring text quality, is the most effective and reliable method among the three exploited approaches.
arXiv Detail & Related papers (2023-04-03T05:29:58Z)
- Evaluating and Improving Factuality in Multimodal Abstractive Summarization [91.46015013816083]
We propose CLIPBERTScore, a simple weighted combination of CLIPScore and BERTScore, to leverage their robustness and strong factuality detection performance on image-summary and document-summary pairs, respectively.
We show that this simple combination of two metrics in the zero-shot setting achieves higher correlations than existing factuality metrics for document summarization.
Our analysis demonstrates the robustness and high correlation of CLIPBERTScore and its components on four factuality metric-evaluation benchmarks.
arXiv Detail & Related papers (2022-11-04T16:50:40Z)
- Cognitive Representation Learning of Self-Media Online Article Quality [24.084727302752377]
Self-media online articles are mainly created by users and exhibit varied text layouts and multi-modal hybrid editing.
We establish a joint model, CoQAN, that combines layout organization, writing characteristics, and text semantics.
We have also constructed a large-scale real-world assessment dataset.
arXiv Detail & Related papers (2020-08-13T02:59:52Z)
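For the QuRating entry above, learning scalar ratings from pairwise judgments can be sketched with a Bradley-Terry-style objective. The tiny linear rater, its names, and the toy data below are illustrative assumptions rather than the paper's released QuRater code.

```python
# A minimal sketch of learning a scalar quality rating from pairwise judgments,
# as described in the QuRating entry above. The Bradley-Terry-style objective and
# the tiny linear rater are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Rater(nn.Module):
    """Maps a document feature vector to a single scalar quality rating."""
    def __init__(self, emb_dim: int = 768):
        super().__init__()
        self.head = nn.Linear(emb_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(x).squeeze(-1)

def pairwise_loss(rater, winner_emb, loser_emb):
    """Bradley-Terry objective: the preferred ('winner') document should score higher."""
    margin = rater(winner_emb) - rater(loser_emb)
    return -F.logsigmoid(margin).mean()

# Toy usage: random embeddings stand in for features of judged document pairs.
rater = Rater()
winners, losers = torch.randn(16, 768), torch.randn(16, 768)
loss = pairwise_loss(rater, winners, losers)
loss.backward()
# At annotation time, rater(doc_emb) yields the scalar rating used to label a corpus.
```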