Quality-Diversity through AI Feedback
- URL: http://arxiv.org/abs/2310.13032v4
- Date: Thu, 7 Dec 2023 19:56:21 GMT
- Title: Quality-Diversity through AI Feedback
- Authors: Herbie Bradley, Andrew Dai, Hannah Teufel, Jenny Zhang, Koen
Oostermeijer, Marco Bellagente, Jeff Clune, Kenneth Stanley, Gr\'egory
Schott, Joel Lehman
- Abstract summary: Quality-diversity (QD) search algorithms aim at continually improving and diversifying a population of candidates.
Recent developments in language models (LMs) have enabled guiding search through AI feedback.
QDAIF is a step towards AI systems that can independently search, diversify, evaluate, and improve.
- Score: 10.423093353553217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many text-generation problems, users may prefer not only a single
response, but a diverse range of high-quality outputs from which to choose.
Quality-diversity (QD) search algorithms aim at such outcomes, by continually
improving and diversifying a population of candidates. However, the
applicability of QD to qualitative domains, like creative writing, has been
limited by the difficulty of algorithmically specifying measures of quality and
diversity. Interestingly, recent developments in language models (LMs) have
enabled guiding search through AI feedback, wherein LMs are prompted in natural
language to evaluate qualitative aspects of text. Leveraging this development,
we introduce Quality-Diversity through AI Feedback (QDAIF), wherein an
evolutionary algorithm applies LMs to both generate variation and evaluate the
quality and diversity of candidate text. When assessed on creative writing
domains, QDAIF covers more of a specified search space with high-quality
samples than do non-QD controls. Further, human evaluation of QDAIF-generated
creative texts validates reasonable agreement between AI and human evaluation.
Our results thus highlight the potential of AI feedback to guide open-ended
search for creative and original solutions, providing a recipe that seemingly
generalizes to many domains and modalities. In this way, QDAIF is a step
towards AI systems that can independently search, diversify, evaluate, and
improve, which are among the core skills underlying human society's capacity
for innovation.
Related papers
- Who Writes the Review, Human or AI? [0.36498648388765503]
This study proposes a methodology to accurately distinguish AI-generated and human-written book reviews.
Our approach utilizes transfer learning, enabling the model to identify generated text across different topics.
The experimental results demonstrate that it is feasible to detect the original source of text, achieving an accuracy rate of 96.86%.
arXiv Detail & Related papers (2024-05-30T17:38:44Z) - Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning [58.41087653543607]
We first establish a novel Image Quality Assessment (IQA) database for AIGIs, termed AIGCIQA2023+.
This paper presents a MINT-IQA model to evaluate and explain human preferences for AIGIs from Multi-perspectives with INstruction Tuning.
arXiv Detail & Related papers (2024-05-12T17:45:11Z) - Large Language Models as In-context AI Generators for Quality-Diversity [8.585387103144825]
In-context QD aims to generate interesting solutions using few-shot and many-shot prompting with quality-diverse examples from the QD archive as context.
In-context QD displays promising results compared to both QD baselines and similar strategies developed for single-objective optimization.
arXiv Detail & Related papers (2024-04-24T10:35:36Z) - Towards Possibilities & Impossibilities of AI-generated Text Detection:
A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses.
Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs.
To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z) - Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization [13.436983663467938]
This paper introduces Quality Diversity through Human Feedback (QDHF), a novel approach that progressively infers diversity metrics from human judgments of similarity among solutions.
Empirical studies show that QDHF significantly outperforms state-of-the-art methods in automatic diversity discovery.
In open-ended generative tasks, QDHF substantially enhances the diversity of text-to-image generation from a diffusion model.
arXiv Detail & Related papers (2023-10-18T16:46:16Z) - ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate [57.71597869337909]
We build a multi-agent referee team called ChatEval to autonomously discuss and evaluate the quality of generated responses from different models.
Our analysis shows that ChatEval transcends mere textual scoring, offering a human-mimicking evaluation process for reliable assessments.
arXiv Detail & Related papers (2023-08-14T15:13:04Z) - Intrinsic Dimension Estimation for Robust Detection of AI-Generated
Texts [22.852855047237153]
We show that the average intrinsic dimensionality of fluent texts in a natural language is hovering around the value $9$ for several alphabet-based languages and around $7$ for Chinese.
This property allows us to build a score-based artificial text detector.
arXiv Detail & Related papers (2023-06-07T18:38:04Z) - Towards Robust Text-Prompted Semantic Criterion for In-the-Wild Video
Quality Assessment [54.31355080688127]
We introduce a text-prompted Semantic Affinity Quality Index (SAQI) and its localized version (SAQI-Local) using Contrastive Language-Image Pre-training (CLIP)
BVQI-Local demonstrates unprecedented performance, surpassing existing zero-shot indices by at least 24% on all datasets.
We conduct comprehensive analyses to investigate different quality concerns of distinct indices, demonstrating the effectiveness and rationality of our design.
arXiv Detail & Related papers (2023-04-28T08:06:05Z) - MAILS -- Meta AI Literacy Scale: Development and Testing of an AI
Literacy Questionnaire Based on Well-Founded Competency Models and
Psychological Change- and Meta-Competencies [6.368014180870025]
The questionnaire should be modular (i.e., including different facets that can be used independently of each other) to be flexibly applicable in professional life.
We derived 60 items to represent different facets of AI Literacy according to Ng and colleagues conceptualisation of AI literacy.
Additional 12 items to represent psychological competencies such as problem solving, learning, and emotion regulation in regard to AI.
arXiv Detail & Related papers (2023-02-18T12:35:55Z) - The Role of AI in Drug Discovery: Challenges, Opportunities, and
Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed.
The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z) - Improving the Question Answering Quality using Answer Candidate
Filtering based on Natural-Language Features [117.44028458220427]
We address the problem of how the Question Answering (QA) quality of a given system can be improved.
Our main contribution is an approach capable of identifying wrong answers provided by a QA system.
In particular, our approach has shown its potential while removing in many cases the majority of incorrect answers.
arXiv Detail & Related papers (2021-12-10T11:09:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.