Related papers: A Survey on Personalized Content Synthesis with Diffusion Models

URL: http://arxiv.org/abs/2405.05538v1
Date: Thu, 9 May 2024 04:36:04 GMT
Title: A Survey on Personalized Content Synthesis with Diffusion Models
Authors: Xulu Zhang, Xiao-Yong Wei, Wengyu Zhang, Jinlin Wu, Zhaoxiang Zhang, Zhen Lei, Qing Li
Abstract summary: PCS aims to customize the subject of interest to specific user-defined prompts. Over the past two years, more than 150 methods have been proposed. This paper offers a comprehensive survey of PCS, with a particular focus on the diffusion models.
Score: 57.01364199734464
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advancements in generative models have significantly impacted content creation, leading to the emergence of Personalized Content Synthesis (PCS). With a small set of user-provided examples, PCS aims to customize the subject of interest to specific user-defined prompts. Over the past two years, more than 150 methods have been proposed. However, existing surveys mainly focus on text-to-image generation, with few providing up-to-date summaries on PCS. This paper offers a comprehensive survey of PCS, with a particular focus on the diffusion models. Specifically, we introduce the generic frameworks of PCS research, which can be broadly classified into optimization-based and learning-based approaches. We further categorize and analyze these methodologies, discussing their strengths, limitations, and key techniques. Additionally, we delve into specialized tasks within the field, such as personalized object generation, face synthesis, and style personalization, highlighting their unique challenges and innovations. Despite encouraging progress, we also present an analysis of the challenges such as overfitting and the trade-off between subject fidelity and text alignment. Through this detailed overview and analysis, we propose future directions to advance the development of PCS.

Related papers

Bridging Text and Video Generation: A Survey [0.41998444721319217]
Text-to-video technology holds potential to transform domains such as education, marketing, entertainment, and assistive technologies for individuals with visual or reading comprehension challenges. We present a comprehensive survey of text-to-video generative models, tracing their development from early GANs and VAEs to hybrid Diffusion-Transformer (DiT) architectures. We provide a systematic account of the datasets, which the surveyed text-to-video models were trained and evaluated on, and to support and assess the accessibility of training such models.
arXiv Detail & Related papers (2025-10-06T16:39:05Z)
Personalized Generation In Large Model Era: A Survey [90.7579254803302]
In the era of large models, content generation is gradually shifting to Personalized Generation (PGen). This paper presents the first comprehensive survey on PGen, investigating existing research in this rapidly growing field. By bridging PGen research across multiple modalities, this survey serves as a valuable resource for fostering knowledge sharing and interdisciplinary collaboration.
arXiv Detail & Related papers (2025-03-04T13:34:19Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Civiverse: A Dataset for Analyzing User Engagement with Open-Source Text-to-Image Models [0.7209758868768352]
We analyze the Civiverse prompt dataset, encompassing millions of images and related metadata. We focus on prompt analysis, specifically examining the semantic characteristics of text prompts. Our findings reveal a predominant preference for generating explicit content, along with a focus on homogenization of semantic content.
arXiv Detail & Related papers (2024-08-10T21:41:03Z)
Self-Supervised Learning for Text Recognition: A Critical Survey [11.599791967838481]
Text Recognition (TR) refers to the research area that focuses on retrieving textual information from images. Self-Supervised Learning (SSL) has gained attention by utilizing large datasets of unlabeled data to train Deep Neural Networks (DNN). This paper seeks to consolidate the use of SSL in the field of TR, offering a critical and comprehensive overview of the current state of the art.
arXiv Detail & Related papers (2024-07-29T11:11:17Z)
Privacy Preserving Prompt Engineering: A Survey [14.402638881376419]
Pre-trained language models (PLMs) have demonstrated significant proficiency in solving a wide range of general natural language processing (NLP) tasks. As a result, the sizes of these models have notably expanded in recent years. Privacy concerns have become a major obstacle to their widespread usage.
arXiv Detail & Related papers (2024-04-09T04:11:25Z)
User Modeling and User Profiling: A Comprehensive Survey [0.0]
This paper presents a survey of the current state, evolution, and future directions of user modeling and profiling research. We provide a historical overview, tracing the development from early stereotype models to the latest deep learning techniques. We also address the critical need for privacy-preserving techniques and the push towards explainability and fairness in user modeling approaches.
arXiv Detail & Related papers (2024-02-15T02:06:06Z)
Recent Advances in Predictive Modeling with Electronic Health Records [71.19967863320647]
Utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics. Deep learning has demonstrated its superiority in various applications, including healthcare.
arXiv Detail & Related papers (2024-02-02T00:31:01Z)
Fine-Grained Zero-Shot Learning: Advances, Challenges, and Prospects [84.36935309169567]
We present a broad review of recent advances for fine-grained analysis in zero-shot learning (ZSL). We first provide a taxonomy of existing methods and techniques with a thorough analysis of each category. Then, we summarize the benchmark, covering publicly available datasets, models, implementations, and some more details as a library.
arXiv Detail & Related papers (2024-01-31T11:51:24Z)
SoK: Privacy-Preserving Data Synthesis [72.92263073534899]
This paper focuses on privacy-preserving data synthesis (PPDS) by providing a comprehensive overview, analysis, and discussion of the field. We put forth a master recipe that unifies two prominent strands of research in PPDS: statistical methods and deep learning (DL)-based methods.
arXiv Detail & Related papers (2023-07-05T08:29:31Z)
Geometric Deep Learning for Structure-Based Drug Design: A Survey [83.87489798671155]
Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates. Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, have significantly propelled the field forward.
arXiv Detail & Related papers (2023-06-20T14:21:58Z)
Few Shot Semantic Segmentation: a review of methodologies, benchmarks, and open challenges [5.0243930429558885]
Few-Shot Semantic Segmentation is a novel task in computer vision, which aims at designing models capable of segmenting new semantic classes with only a few examples. This paper consists of a comprehensive survey of Few-Shot Semantic Segmentation, tracing its evolution and exploring various model designs.
arXiv Detail & Related papers (2023-04-12T13:07:37Z)
Recent Few-Shot Object Detection Algorithms: A Survey with Performance Comparison [54.357707168883024]
Few-Shot Object Detection (FSOD) mimics the human ability of learning to learn. FSOD intelligently transfers the learned generic object knowledge from the common heavy-tailed to the novel long-tailed object classes. We give an overview of FSOD, including the problem definition, common datasets, and evaluation protocols.
arXiv Detail & Related papers (2022-03-27T04:11:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.