Human Preference Score v2: A Solid Benchmark for Evaluating Human
Preferences of Text-to-Image Synthesis
- URL: http://arxiv.org/abs/2306.09341v2
- Date: Mon, 25 Sep 2023 08:19:23 GMT
- Title: Human Preference Score v2: A Solid Benchmark for Evaluating Human
Preferences of Text-to-Image Synthesis
- Authors: Xiaoshi Wu, Yiming Hao, Keqiang Sun, Yixiong Chen, Feng Zhu, Rui Zhao,
Hongsheng Li
- Abstract summary: Recent text-to-image generative models can generate high-fidelity images from text inputs.
HPD v2 captures human preferences on images from a wide range of sources.
HPD v2 comprises 798,090 human preference choices on 433,760 pairs of images.
- Score: 38.70605308204128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent text-to-image generative models can generate high-fidelity images from
text inputs, but the quality of these generated images cannot be accurately
evaluated by existing evaluation metrics. To address this issue, we introduce
Human Preference Dataset v2 (HPD v2), a large-scale dataset that captures human
preferences on images from a wide range of sources. HPD v2 comprises 798,090
human preference choices on 433,760 pairs of images, making it the largest
dataset of its kind. The text prompts and images are deliberately collected to
eliminate potential bias, which is a common issue in previous datasets. By
fine-tuning CLIP on HPD v2, we obtain Human Preference Score v2 (HPS v2), a
scoring model that can more accurately predict human preferences on generated
images. Our experiments demonstrate that HPS v2 generalizes better than
previous metrics across various image distributions and is responsive to
algorithmic improvements of text-to-image generative models, making it a
preferable evaluation metric for these models. We also investigate the design
of the evaluation prompts for text-to-image generative models to make the
evaluation stable, fair, and easy to use. Finally, we establish a benchmark for
text-to-image generative models using HPS v2, which includes a set of recent
text-to-image models from academia, the community, and industry. The code and
dataset are available at https://github.com/tgxs002/HPSv2.
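As a concrete illustration of the approach described in the abstract, the sketch below shows how a CLIP-style similarity score can be fine-tuned on binary human choices with a pairwise cross-entropy objective. This is a minimal sketch, not the authors' training code: the function names, the temperature value, and the toy data are illustrative assumptions; the actual implementation is in the linked repository.

# Minimal sketch of a pairwise-preference objective for turning a
# CLIP-style model into a preference scorer. All names here are
# illustrative; the real HPS v2 training code lives in the linked repo.
import torch
import torch.nn.functional as F

def preference_score(image_emb, text_emb):
    # Cosine similarity between L2-normalized embeddings, as in CLIP.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    return (image_emb * text_emb).sum(dim=-1)

def pairwise_preference_loss(emb_a, emb_b, text_emb, choice, tau=0.01):
    # choice[i] = 0 if annotators preferred image A, 1 if they preferred B.
    # Treating the two scores as logits and applying cross-entropy teaches
    # the model to rank the preferred image above the other one.
    logits = torch.stack([preference_score(emb_a, text_emb),
                          preference_score(emb_b, text_emb)], dim=1) / tau
    return F.cross_entropy(logits, choice)

# Toy usage with random embeddings standing in for CLIP outputs.
a, b, t = torch.randn(4, 512), torch.randn(4, 512), torch.randn(4, 512)
choice = torch.tensor([0, 1, 1, 0])
print(pairwise_preference_loss(a, b, t, choice))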
Related papers
- Scalable Ranked Preference Optimization for Text-to-Image Generation [76.16285931871948]
We investigate a scalable approach for collecting large-scale and fully synthetic datasets for DPO training.
The preferences for paired images are generated with a pre-trained reward function, eliminating the need to involve humans in the annotation process.
We introduce RankDPO to enhance DPO-based methods using the ranking feedback.
arXiv Detail & Related papers (2024-10-23T16:42:56Z)
- Learning Multi-dimensional Human Preference for Text-to-Image Generation [18.10755131392223]
We propose the Multi-dimensional Preference Score (MPS), the first multi-dimensional preference scoring model for the evaluation of text-to-image models.
MPS introduces a preference condition module on top of the CLIP model to learn these diverse preferences.
It is trained based on our Multi-dimensional Human Preference (MHP) dataset, which comprises 918,315 human preference choices across four dimensions.
arXiv Detail & Related papers (2024-05-23T15:39:43Z)
- Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models [85.96013373385057]
Fine-tuning text-to-image models with reward functions trained on human feedback data has proven effective for aligning model behavior with human intent.
However, excessive optimization with such reward models, which serve as mere proxy objectives, can compromise the performance of fine-tuned models.
We propose TextNorm, a method that enhances alignment based on a measure of reward model confidence estimated across a set of semantically contrastive text prompts.
arXiv Detail & Related papers (2024-04-02T11:40:38Z)
- Filter & Align: Leveraging Human Knowledge to Curate Image-Text Data [31.507451966555383]
We present a novel algorithm that incorporates human knowledge of image-text alignment to guide the filtering of vast corpora of web-crawled image-text data.
We collect a diverse image-text dataset where each image is associated with multiple captions from various sources.
We train a reward model on these human-preference annotations to internalize the nuanced human understanding of image-text alignment.
arXiv Detail & Related papers (2023-12-11T05:57:09Z)
- Human Preference Score: Better Aligning Text-to-Image Models with Human Preference [41.270068272447055]
We collect a dataset of human choices on generated images from the Stable Foundation Discord channel.
Our experiments demonstrate that current evaluation metrics for generative models do not correlate well with human choices.
We propose a simple yet effective method to adapt Stable Diffusion to better align with human preferences.
arXiv Detail & Related papers (2023-03-25T10:09:03Z)
- Aligning Text-to-Image Models using Human Feedback [104.76638092169604]
Current text-to-image models often generate images that are inadequately aligned with text prompts.
We propose a fine-tuning method for aligning such models using human feedback.
Our results demonstrate the potential for learning from human feedback to significantly improve text-to-image models.
arXiv Detail & Related papers (2023-02-23T17:34:53Z)
- Lafite2: Few-shot Text-to-Image Generation [132.14211027057766]
We propose a novel method for pre-training a text-to-image generation model on image-only datasets.
It uses a retrieval-then-optimization procedure to synthesize pseudo text features.
It benefits a wide range of settings, including few-shot, semi-supervised, and fully-supervised learning.
arXiv Detail & Related papers (2022-10-25T16:22:23Z)
- Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding [53.170767750244366]
Imagen is a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding.
To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models.
arXiv Detail & Related papers (2022-05-23T17:42:53Z)
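A protocol shared by several of the papers above (DrawBench, the HPS v2 benchmark) is to rank text-to-image models by scoring their generations on a fixed prompt set. The sketch below illustrates that loop under stated assumptions: generate_image and score_image are hypothetical stand-ins for a generative model and a preference scorer such as HPS v2, not real APIs.

# Sketch of the benchmark protocol: generate images for a fixed prompt
# set and rank models by mean preference score. generate_image and
# score_image are hypothetical stand-ins supplied by the caller.
from statistics import mean

def benchmark(models, prompts, generate_image, score_image):
    results = {}
    for name, model in models.items():
        scores = [score_image(generate_image(model, p), p) for p in prompts]
        results[name] = mean(scores)
    # A higher mean preference score indicates better alignment
    # with human preferences on this prompt set.
    return dict(sorted(results.items(), key=lambda kv: -kv[1]))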
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.