Exploring Latent Dimensions of Crowd-sourced Creativity
- URL: http://arxiv.org/abs/2112.06978v1
- Date: Mon, 13 Dec 2021 19:24:52 GMT
- Title: Exploring Latent Dimensions of Crowd-sourced Creativity
- Authors: Umut Kocasari, Alperen Bag, Efehan Atici and Pinar Yanardag
- Abstract summary: We build our work on the largest AI-based creativity platform, Artbreeder.
We explore the latent dimensions of images generated on this platform and present a novel framework for manipulating images to make them more creative.
- Score: 0.02294014185517203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the discovery of interpretable directions in the latent spaces of
pre-trained GANs has become a popular topic. While existing works mostly
consider directions for semantic image manipulations, we focus on an abstract
property: creativity. Can we manipulate an image to be more or less creative?
We build our work on the largest AI-based creativity platform, Artbreeder,
where users can generate images using pre-trained GAN models. We explore the
latent dimensions of images generated on this platform and present a novel
framework for manipulating images to make them more creative. Our code and
dataset are available at http://github.com/catlab-team/latentcreative.
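As a rough illustration of what manipulating an image along a latent direction means, the sketch below estimates a direction from scored latent codes and shifts a code along it. It uses a simple difference-of-means baseline, which is an assumption for illustration and not the framework described in the abstract. The function names, the toy 8-dimensional latents, and the synthetic scores are all hypothetical. In practice the shifted code would be passed to a pre-trained GAN generator to produce the edited image.

```python
import numpy as np

def mean_difference_direction(latents, scores):
    """Unit vector pointing from low-score latents toward high-score latents.

    Baseline assumption: split the codes at the median score and take the
    difference of the two group means. Not the paper's method.
    """
    latents = np.asarray(latents, dtype=float)
    scores = np.asarray(scores, dtype=float)
    median = np.median(scores)
    high = latents[scores > median]
    low = latents[scores <= median]
    direction = high.mean(axis=0) - low.mean(axis=0)
    return direction / np.linalg.norm(direction)

def shift_latent(z, direction, alpha):
    """Move latent code z by alpha units along a unit-norm direction."""
    return np.asarray(z, dtype=float) + alpha * direction

# Toy data: 200 hypothetical latent codes whose score depends mostly on
# coordinate 0, plus a little noise.
rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 8))
scores = latents[:, 0] + 0.1 * rng.normal(size=200)

d = mean_difference_direction(latents, scores)
z = latents[0]
z_more = shift_latent(z, d, alpha=3.0)   # toward higher scores
z_less = shift_latent(z, d, alpha=-3.0)  # toward lower scores
```

The sign of `alpha` selects "more" or "less" of the scored property, and its magnitude controls how far the code moves from the original.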
Related papers
- PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions [66.92809850624118]
PixWizard is an image-to-image visual assistant designed for image generation, manipulation, and translation based on free-form language instructions.
We cast a variety of vision tasks into a unified image-text-to-image generation framework and curate an Omni Pixel-to-Pixel Instruction-Tuning dataset.
Our experiments demonstrate that PixWizard not only shows impressive generative and understanding abilities for images with diverse resolutions but also exhibits promising generalization capabilities with unseen tasks and human instructions.
arXiv Detail & Related papers (2024-09-23T17:59:46Z)
- Re-Thinking Inverse Graphics With Large Language Models [51.333105116400205]
Inverse graphics -- inverting an image into physical variables that, when rendered, enable reproduction of the observed scene -- is a fundamental challenge in computer vision and graphics.
We propose the Inverse-Graphics Large Language Model (IG-LLM), an inverse-graphics framework centered around an LLM.
We incorporate a frozen pre-trained visual encoder and a continuous numeric head to enable end-to-end training.
arXiv Detail & Related papers (2024-04-23T16:59:02Z)
- Creative Agents: Empowering Agents with Imagination for Creative Tasks [31.920963353890393]
We propose a class of solutions for creative agents, where the controller is enhanced with an imaginator that generates detailed imaginations of task outcomes conditioned on language instructions.
We benchmark creative tasks with the challenging open-world game Minecraft, where the agents are asked to create diverse buildings given free-form language instructions.
We perform a detailed experimental analysis of creative agents, showing that creative agents are the first AI agents accomplishing diverse building creation in the survival mode of Minecraft.
arXiv Detail & Related papers (2023-12-05T06:00:52Z)
- SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation [111.2195741547517]
We present a method to generate controlled sketches using a text-conditioned diffusion model trained on pixel representations of images.
Our objective is to empower non-professional users to create sketches and, through a series of optimisation processes, transform a narrative into a storyboard.
arXiv Detail & Related papers (2023-08-27T19:44:44Z)
- CLIP-CLOP: CLIP-Guided Collage and Photomontage [16.460669517251084]
We design a gradient-based generator to produce collages.
It requires the human artist to curate libraries of image patches and to describe (with prompts) the whole image composition.
We explore the aesthetic potentials of high-resolution collages, and provide an open-source Google Colab as an artistic tool.
arXiv Detail & Related papers (2022-05-06T11:33:49Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- Telling Creative Stories Using Generative Visual Aids [52.623545341588304]
We asked writers to write creative stories from a starting prompt, and provided them with visuals created by generative AI models from the same prompt.
Compared to a control group, writers who used the visuals as a story-writing aid wrote significantly more creative, original, complete and visualizable stories.
Findings indicate that cross-modality inputs from AI can benefit divergent aspects of creativity in human-AI co-creation, but hinder convergent thinking.
arXiv Detail & Related papers (2021-10-27T23:13:47Z)
- The Intrinsic Dimension of Images and Its Impact on Learning [60.811039723427676]
It is widely believed that natural image data exhibits low-dimensional structure despite the high dimensionality of conventional pixel representations.
In this work, we apply dimension estimation tools to popular datasets and investigate the role of low-dimensional structure in deep learning.
arXiv Detail & Related papers (2021-04-18T16:29:23Z)
- Creativity of Deep Learning: Conceptualization and Assessment [1.5738019181349994]
We use insights from computational creativity to conceptualize and assess current applications of generative deep learning in creative domains.
We highlight parallels between current systems and different models of human creativity as well as their shortcomings.
arXiv Detail & Related papers (2020-12-03T21:44:07Z)
- A Framework and Dataset for Abstract Art Generation via CalligraphyGAN [0.0]
We present a creative framework based on Conditional Generative Adversarial Networks and a Contextual Neural Language Model to generate abstract artworks.
Our work is inspired by Chinese calligraphy, which is a unique form of visual art where the character itself is an aesthetic painting.
arXiv Detail & Related papers (2020-12-02T16:24:20Z)
- Words as Art Materials: Generating Paintings with Sequential GANs [8.249180979158815]
We investigate the generation of artistic images on a large-variance dataset.
This dataset includes images with variations, for example, in shape, color, and content.
We propose a sequential Generative Adversarial Network model.
arXiv Detail & Related papers (2020-07-08T19:17:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.