AI-Generated Content (AIGC) for Various Data Modalities: A Survey
- URL: http://arxiv.org/abs/2308.14177v4
- Date: Sat, 21 Oct 2023 15:45:04 GMT
- Title: AI-Generated Content (AIGC) for Various Data Modalities: A Survey
- Authors: Lin Geng Foo, Hossein Rahmani, Jun Liu
- Abstract summary: AIGC methods aim to produce text, images, videos, 3D assets, and other media using AI algorithms.
We provide a comprehensive review of AIGC methods across different data modalities, including both single-modality and cross-modality methods.
- Score: 17.787268628612765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI-generated content (AIGC) methods aim to produce text, images, videos, 3D
assets, and other media using AI algorithms. Due to its wide range of
applications and the demonstrated potential of recent works, AIGC developments
have been attracting lots of attention recently, and AIGC methods have been
developed for various data modalities, such as image, video, text, 3D shape (as
voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human
avatar (body and head), 3D motion, and audio -- each presenting different
characteristics and challenges. Furthermore, there have also been many
significant developments in cross-modality AIGC methods, where generative
methods can receive conditioning input in one modality and produce outputs in
another. Examples include going from various modalities to image, video, 3D
shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and avatar),
and audio modalities. In this paper, we provide a comprehensive review of AIGC
methods across different data modalities, including both single-modality and
cross-modality methods, highlighting the various challenges, representative
works, and recent technical directions in each setting. We also survey the
representative datasets throughout the modalities, and present comparative
results for various modalities. Moreover, we also discuss the challenges and
potential future research directions.
Related papers
- The Evolution and Future Perspectives of Artificial Intelligence Generated Content [7.586328912947784]
Review traces AIGC's evolution through four developmental milestones.
This study aims to guide researchers and practitioners in selecting and optimizing AIGC models.
arXiv Detail & Related papers (2024-12-02T20:16:40Z) - Generative Artificial Intelligence Meets Synthetic Aperture Radar: A Survey [49.29751866761522]
This paper aims to investigate the intersection of GenAI and SAR.
First, we illustrate the common data generation-based applications in SAR field.
Then, an overview of the latest GenAI models is systematically reviewed.
Finally, the corresponding applications in SAR domain are also included.
arXiv Detail & Related papers (2024-11-05T03:06:00Z) - A Comprehensive Methodological Survey of Human Activity Recognition Across Divers Data Modalities [2.916558661202724]
Human Activity Recognition (HAR) systems aim to understand human behaviour and assign a label to each action.
HAR can leverage various data modalities, such as RGB images and video, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, and radar signals.
This paper presents a comprehensive survey of the latest advancements in HAR from 2014 to 2024.
arXiv Detail & Related papers (2024-09-15T10:04:44Z) - 3D Gaussian Splatting: Survey, Technologies, Challenges, and Opportunities [57.444435654131006]
3D Gaussian Splatting (3DGS) has emerged as a prominent technique with the potential to become a mainstream method for 3D representations.
This survey aims to analyze existing 3DGS-related works from multiple intersecting perspectives.
arXiv Detail & Related papers (2024-07-24T16:53:17Z) - A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing
Objects in 3D Scenes [80.20670062509723]
3D dense captioning is an emerging vision-language bridging task that aims to generate detailed descriptions for 3D scenes.
It presents significant potential and challenges due to its closer representation of the real world compared to 2D visual captioning.
Despite the popularity and success of existing methods, there is a lack of comprehensive surveys summarizing the advancements in this field.
arXiv Detail & Related papers (2024-03-12T10:04:08Z) - Bridging MDE and AI: A Systematic Review of Domain-Specific Languages and Model-Driven Practices in AI Software Systems Engineering [1.4853133497896698]
This study aims to investigate the existing model-driven approaches relying on DSL in support of the engineering of AI software systems.
The use of MDE for AI is still in its early stages, and there is no single tool or method that is widely used.
arXiv Detail & Related papers (2023-07-10T14:38:38Z) - A Comprehensive Survey of AI-Generated Content (AIGC): A History of
Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC)
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z) - DIME: Fine-grained Interpretations of Multimodal Models via Disentangled
Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z) - Human Action Recognition from Various Data Modalities: A Review [37.07491839026713]
Human Action Recognition (HAR) aims to understand human behavior and assign a label to each action.
HAR has a wide range of applications, and has been attracting increasing attention in the field of computer vision.
We present a survey of recent progress in deep learning methods for HAR based on the type of input data modality.
arXiv Detail & Related papers (2020-12-22T07:37:43Z) - Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is a task to identify various human actions in a video.
Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.