Quality-Diversity through AI Feedback
- URL: http://arxiv.org/abs/2310.13032v4
- Date: Thu, 7 Dec 2023 19:56:21 GMT
- Title: Quality-Diversity through AI Feedback
- Authors: Herbie Bradley, Andrew Dai, Hannah Teufel, Jenny Zhang, Koen
Oostermeijer, Marco Bellagente, Jeff Clune, Kenneth Stanley, Grégory
Schott, Joel Lehman
- Abstract summary: Quality-diversity (QD) search algorithms aim at continually improving and diversifying a population of candidates.
Recent developments in language models (LMs) have enabled guiding search through AI feedback.
QDAIF is a step towards AI systems that can independently search, diversify, evaluate, and improve.
- Score: 10.423093353553217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many text-generation problems, users may prefer not only a single
response, but a diverse range of high-quality outputs from which to choose.
Quality-diversity (QD) search algorithms aim at such outcomes, by continually
improving and diversifying a population of candidates. However, the
applicability of QD to qualitative domains, like creative writing, has been
limited by the difficulty of algorithmically specifying measures of quality and
diversity. Interestingly, recent developments in language models (LMs) have
enabled guiding search through AI feedback, wherein LMs are prompted in natural
language to evaluate qualitative aspects of text. Leveraging this development,
we introduce Quality-Diversity through AI Feedback (QDAIF), wherein an
evolutionary algorithm applies LMs to both generate variation and evaluate the
quality and diversity of candidate text. When assessed on creative writing
domains, QDAIF covers more of a specified search space with high-quality
samples than do non-QD controls. Further, human evaluation of QDAIF-generated
creative texts validates reasonable agreement between AI and human evaluation.
Our results thus highlight the potential of AI feedback to guide open-ended
search for creative and original solutions, providing a recipe that seemingly
generalizes to many domains and modalities. In this way, QDAIF is a step
towards AI systems that can independently search, diversify, evaluate, and
improve, which are among the core skills underlying human society's capacity
for innovation.
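To make the described loop concrete, below is a minimal, illustrative sketch of a MAP-Elites-style QDAIF iteration, where a language model both proposes variations and scores them. The `mutate_with_lm` and `evaluate_with_lm` callables are hypothetical placeholders for prompted LM calls; the paper's actual prompts, models, archive layout, and diversity axes differ.

```python
import random
from typing import Callable, Dict, List, Tuple


def qdaif(
    seed_texts: List[str],
    # Prompted LM call that rewrites a parent text into a new candidate (placeholder).
    mutate_with_lm: Callable[[str], str],
    # Prompted LM call returning (quality, diversity attribute), both assumed in [0, 1] (placeholder).
    evaluate_with_lm: Callable[[str], Tuple[float, float]],
    iterations: int = 1000,
    num_bins: int = 20,
) -> Dict[int, Tuple[float, str]]:
    """MAP-Elites-style loop with AI feedback: keep the best candidate per diversity bin."""
    # Archive maps a diversity bin index to the best (quality, text) found so far.
    archive: Dict[int, Tuple[float, str]] = {}

    def try_insert(text: str) -> None:
        quality, attribute = evaluate_with_lm(text)
        bin_idx = min(int(attribute * num_bins), num_bins - 1)
        incumbent = archive.get(bin_idx)
        if incumbent is None or quality > incumbent[0]:
            archive[bin_idx] = (quality, text)

    # Initialise the archive from the seed candidates.
    for text in seed_texts:
        try_insert(text)

    # Evolve: sample an elite, ask the LM for a variation, and re-insert it.
    for _ in range(iterations):
        _, parent = random.choice(list(archive.values()))
        try_insert(mutate_with_lm(parent))

    return archive
```

In a creative-writing setting, for example, the diversity attribute might be an LM-judged sentiment or genre score and the quality an LM-judged rating of how well written the piece is, so that both measures come from natural-language prompts rather than hand-coded metrics.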
Related papers
- AI-Generated Image Quality Assessment Based on Task-Specific Prompt and Multi-Granularity Similarity [62.00987205438436]
We propose a novel quality assessment method for AIGIs named TSP-MGS.
It designs task-specific prompts and measures multi-granularity similarity between AIGIs and the prompts.
Experiments on the commonly used AGIQA-1K and AGIQA-3K benchmarks demonstrate the superiority of the proposed TSP-MGS.
arXiv Detail & Related papers (2024-11-25T04:47:53Z)
- The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models [0.0]
Large language models (LLMs) and generative AI have revolutionized natural language processing (NLP).
This chapter explores the transformative potential of LLMs in automated question generation and answer assessment.
arXiv Detail & Related papers (2024-10-12T15:54:53Z)
- Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA [43.116608441891096]
Humans outperform AI systems in knowledge-grounded abductive and conceptual reasoning.
State-of-the-art LLMs like GPT-4 and LLaMA show superior performance on targeted information retrieval.
arXiv Detail & Related papers (2024-10-09T03:53:26Z)
- Decoding AI and Human Authorship: Nuances Revealed Through NLP and Statistical Analysis [0.0]
This research explores the nuanced differences in texts produced by AI and those written by humans.
The study investigates various linguistic traits, patterns of creativity, and potential biases inherent in human-written and AI-generated texts.
arXiv Detail & Related papers (2024-07-15T18:09:03Z)
- Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning [58.41087653543607]
We first establish a novel Image Quality Assessment (IQA) database for AIGIs, termed AIGCIQA2023+.
This paper presents a MINT-IQA model to evaluate and explain human preferences for AIGIs from Multi-perspectives with INstruction Tuning.
arXiv Detail & Related papers (2024-05-12T17:45:11Z)
- Large Language Models as In-context AI Generators for Quality-Diversity [8.585387103144825]
In-context QD aims to generate interesting solutions using few-shot and many-shot prompting with quality-diverse examples from the QD archive as context.
In-context QD displays promising results compared to both QD baselines and similar strategies developed for single-objective optimization.
arXiv Detail & Related papers (2024-04-24T10:35:36Z)
- Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization [13.436983663467938]
This paper introduces Quality Diversity through Human Feedback (QDHF), a novel approach that progressively infers diversity metrics from human judgments of similarity among solutions.
Empirical studies show that QDHF significantly outperforms state-of-the-art methods in automatic diversity discovery.
In open-ended generative tasks, QDHF substantially enhances the diversity of text-to-image generation from a diffusion model.
arXiv Detail & Related papers (2023-10-18T16:46:16Z)
- ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate [57.71597869337909]
We build a multi-agent referee team called ChatEval to autonomously discuss and evaluate the quality of generated responses from different models.
Our analysis shows that ChatEval transcends mere textual scoring, offering a human-mimicking evaluation process for reliable assessments.
arXiv Detail & Related papers (2023-08-14T15:13:04Z)
- Towards Robust Text-Prompted Semantic Criterion for In-the-Wild Video Quality Assessment [54.31355080688127]
We introduce a text-prompted Semantic Affinity Quality Index (SAQI) and its localized version (SAQI-Local) using Contrastive Language-Image Pre-training (CLIP).
BVQI-Local demonstrates unprecedented performance, surpassing existing zero-shot indices by at least 24% on all datasets.
We conduct comprehensive analyses to investigate different quality concerns of distinct indices, demonstrating the effectiveness and rationality of our design.
arXiv Detail & Related papers (2023-04-28T08:06:05Z)
- The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
- Improving the Question Answering Quality using Answer Candidate Filtering based on Natural-Language Features [117.44028458220427]
We address the problem of how the Question Answering (QA) quality of a given system can be improved.
Our main contribution is an approach capable of identifying wrong answers provided by a QA system.
In particular, our approach has shown its potential by removing, in many cases, the majority of incorrect answers.
arXiv Detail & Related papers (2021-12-10T11:09:44Z)