VIDES: Virtual Interior Design via Natural Language and Visual Guidance
- URL: http://arxiv.org/abs/2308.13795v1
- Date: Sat, 26 Aug 2023 07:41:42 GMT
- Title: VIDES: Virtual Interior Design via Natural Language and Visual Guidance
- Authors: Minh-Hien Le and Chi-Bien Chu and Khanh-Duy Le and Tam V. Nguyen and
Minh-Triet Tran and Trung-Nghia Le
- Abstract summary: We propose the Virtual Interior DESign (VIDES) system in response to this challenge.
Leveraging cutting-edge technology in generative AI, our system can assist users in generating and editing indoor scene concepts.
- Score: 16.35842298296878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interior design is crucial in creating aesthetically pleasing and functional
indoor spaces. However, developing and editing interior design concepts
requires significant time and expertise. We propose the Virtual Interior DESign
(VIDES) system in response to this challenge. Leveraging cutting-edge
generative AI technology, our system can assist users in generating and
editing indoor scene concepts quickly, given a user's text description and visual
guidance. Using both visual guidance and language as conditional inputs
significantly enhances the accuracy and coherence of the generated scenes,
resulting in visually appealing designs. Through extensive experimentation, we
demonstrate the effectiveness of VIDES in developing new indoor concepts,
changing indoor styles, and replacing and removing interior objects. The system
successfully captures the essence of users' descriptions while providing
flexibility for customization. Consequently, this system can potentially reduce
the entry barrier for indoor design, making it more accessible to users with
limited technical skills and reducing the time required to create high-quality
images. Individuals who have a background in design can now easily communicate
their ideas visually and effectively present their design concepts.
https://sites.google.com/view/ltnghia/research/VIDES
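As a rough illustration of the idea (not the authors' implementation), the sketch below shows how a text prompt and visual guidance can jointly condition an image editor, using a Hugging Face diffusers Stable Diffusion inpainting pipeline as a stand-in for VIDES: the mask localizes the object to replace, the prompt describes the replacement. The checkpoint name and file paths are placeholders.

```python
# Hedged sketch: object replacement conditioned on text + a mask (visual guidance),
# using an off-the-shelf inpainting pipeline as a stand-in for VIDES.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # placeholder checkpoint, not the paper's model
    torch_dtype=torch.float16,
).to("cuda")

room = Image.open("living_room.png").convert("RGB")  # indoor scene to edit (placeholder path)
mask = Image.open("sofa_mask.png").convert("L")      # white = region to replace (placeholder path)

# The two conditional inputs: the mask localizes the edit, the prompt specifies it.
edited = pipe(
    prompt="a mid-century modern leather sofa, warm ambient lighting, photorealistic",
    image=room,
    mask_image=mask,
    num_inference_steps=50,
    guidance_scale=7.5,
).images[0]
edited.save("edited_room.png")
```

Object removal or an indoor style change would follow the same pattern with a different mask and prompt.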
Related papers
- MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models [51.1034358143232]
We introduce component-controllable personalization, a novel task that pushes the boundaries of text-to-image (T2I) models.
To overcome these challenges, we design MagicTailor, an innovative framework that leverages Dynamic Masked Degradation (DM-Deg) to dynamically perturb undesired visual semantics.
arXiv Detail & Related papers (2024-10-17T09:22:53Z)
- Inspired by AI? A Novel Generative AI System To Assist Conceptual Automotive Design [6.001793288867721]
Design inspiration is crucial for establishing the direction of a design as well as evoking feelings and conveying meanings during the conceptual design process.
Many practicing designers use text-based searches on platforms like Pinterest to gather image ideas, followed by sketching on paper or using digital tools to develop concepts.
Emerging generative AI techniques, such as diffusion models, offer a promising avenue to streamline these processes by swiftly generating design concepts based on text and image inspiration inputs.
arXiv Detail & Related papers (2024-06-06T17:04:14Z)
- I-Design: Personalized LLM Interior Designer [57.00412237555167]
I-Design is a personalized interior designer that allows users to generate and visualize their design goals through natural language communication.
I-Design starts with a team of large language model agents that engage in dialogues and logical reasoning with one another.
The final design is then constructed in 3D by retrieving and integrating assets from an existing object database.
arXiv Detail & Related papers (2024-04-03T16:17:53Z)
- MyVLM: Personalizing VLMs for User-Specific Queries [78.33252556805931]
We take a first step toward the personalization of vision-language models, enabling them to learn and reason over user-provided concepts.
To effectively recognize a variety of user-specific concepts, we augment the VLM with external concept heads that function as toggles for the model.
Having recognized the concept, we learn a new concept embedding in the intermediate feature space of the VLM.
This embedding is tasked with guiding the language model to naturally integrate the target concept in its generated response.
arXiv Detail & Related papers (2024-03-21T17:51:01Z)
- DressCode: Autoregressively Sewing and Generating Garments from Text Guidance [61.48120090970027]
DressCode aims to democratize design for novices and offer immense potential in fashion design, virtual try-on, and digital human creation.
We first introduce SewingGPT, a GPT-based architecture integrating cross-attention with text-conditioned embedding to generate sewing patterns.
We then tailor a pre-trained Stable Diffusion to generate tile-based Physically-based Rendering (PBR) textures for the garments.
arXiv Detail & Related papers (2024-01-29T16:24:21Z)
- VISHIEN-MAAT: Scrollytelling visualization design for explaining Siamese Neural Network concept to non-technical users [8.939421900877742]
This work proposes a novel visualization design for creating a scrollytelling that can effectively explain an AI concept to non-technical users.
Our approach helps create a visualization valuable for a short-timeline situation like a sales pitch.
arXiv Detail & Related papers (2023-04-04T08:26:54Z)
- Towards Counterfactual Image Manipulation via CLIP [106.94502632502194]
Existing methods can achieve realistic editing of different visual attributes such as age and gender of facial images.
We investigate this problem in a text-driven manner with Contrastive-Language-Image-Pretraining (CLIP)
We design a novel contrastive loss that exploits predefined CLIP-space directions to guide the editing toward desired directions from different perspectives.
arXiv Detail & Related papers (2022-07-06T17:02:25Z)
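For intuition, a minimal sketch of the general CLIP directional-loss idea that such text-driven editing builds on (not the paper's exact contrastive formulation): the edit direction in CLIP image space is encouraged to align with the direction between a source and a target text embedding. The random tensors below are placeholders for real CLIP features.

```python
import torch
import torch.nn.functional as F

def clip_directional_loss(img_src, img_edit, txt_src, txt_tgt):
    """Align the image edit direction with the text direction in CLIP space.
    All arguments are (batch, dim) CLIP embeddings."""
    d_img = F.normalize(img_edit - img_src, dim=-1)
    d_txt = F.normalize(txt_tgt - txt_src, dim=-1)
    # 1 - cosine similarity between the two directions
    return (1.0 - (d_img * d_txt).sum(dim=-1)).mean()

# Toy usage with random tensors standing in for CLIP features.
torch.manual_seed(0)
emb = lambda: torch.randn(4, 512)
loss = clip_directional_loss(emb(), emb(), emb(), emb())
print(float(loss))
```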
- Neural Scene Decoration from a Single Photograph [24.794743085391953]
We introduce a new problem of domain-specific image synthesis using generative modeling, namely neural scene decoration.
Given a photograph of an empty indoor space, we aim to synthesize a new image of the same space that is fully furnished and decorated.
Our network contains a novel image generator that transforms an initial point-based object layout into a realistic photograph.
arXiv Detail & Related papers (2021-08-04T01:44:21Z)
- Learning-based pose edition for efficient and interactive design [55.41644538483948]
In computer-aided animation, artists define the key poses of a character by manipulating its skeletons.
Character pose must respect many ill-defined constraints, and so the resulting realism greatly depends on the animator's skill and knowledge.
We describe an efficient tool for pose design, allowing users to intuitively manipulate a pose to create character animations.
arXiv Detail & Related papers (2021-07-01T12:15:02Z)
- DeFINE: Delayed Feedback based Immersive Navigation Environment for Studying Goal-Directed Human Navigation [10.7197371210731]
Delayed Feedback based Immersive Navigation Environment (DeFINE) is a framework that allows for easy creation and administration of navigation tasks.
DeFINE has a built-in capability to provide performance feedback to participants during an experiment.
arXiv Detail & Related papers (2020-03-06T11:00:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.