SketcherX: AI-Driven Interactive Robotic drawing with Diffusion model and Vectorization Techniques
- URL: http://arxiv.org/abs/2409.15292v1
- Date: Wed, 4 Sep 2024 02:20:22 GMT
- Title: SketcherX: AI-Driven Interactive Robotic drawing with Diffusion model and Vectorization Techniques
- Authors: Jookyung Song, Mookyoung Kang, Nojun Kwak
- Abstract summary: We introduce SketcherX, a novel robotic system for personalized portrait drawing through interactive human-robot engagement.
Unlike traditional robotic art systems that rely on analog printing techniques, SketcherX captures and processes facial images to produce vectorized drawings in a distinctive, human-like artistic style.
- Score: 26.240518216121487
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce SketcherX, a novel robotic system for personalized portrait drawing through interactive human-robot engagement. Unlike traditional robotic art systems that rely on analog printing techniques, SketcherX captures and processes facial images to produce vectorized drawings in a distinctive, human-like artistic style. The system comprises two 6-axis robotic arms: a face robot, equipped with a head-mounted camera and a Large Language Model (LLM) for real-time interaction, and a drawing robot, which uses a fine-tuned Stable Diffusion model, ControlNet, and Vision-Language models for dynamic, stylized drawing. Our contributions include the development of a custom Vector Low-Rank Adaptation (LoRA) model, enabling seamless adaptation to various artistic styles, and the integration of a pair-wise fine-tuning approach to enhance stroke quality and stylistic accuracy. Experimental results demonstrate the system's ability to produce high-quality, personalized portraits within two minutes, highlighting its potential as a new paradigm in robotic creativity. This work advances the field of robotic art by positioning robots as active participants in the creative process, paving the way for future explorations in interactive, human-robot artistic collaboration.
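The abstract names a concrete generation stack: a fine-tuned Stable Diffusion model, ControlNet conditioning on the captured face, and a custom Vector LoRA for style. The paper gives no implementation details here, but a minimal sketch of how such a stack is typically wired together with Hugging Face diffusers might look like the following; the canny ControlNet checkpoint, the prompt, and the LoRA path are illustrative assumptions, not the authors' released artifacts.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

# Edge-conditioned ControlNet as a stand-in for the paper's face conditioning.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# Hypothetical style adapter standing in for the paper's custom "Vector LoRA";
# no public checkpoint is referenced in the abstract.
pipe.load_lora_weights("path/to/vector-style-lora")

condition = load_image("face_edges.png")  # edge map extracted from the captured portrait
portrait = pipe(
    "clean line-art portrait, vector-style strokes",
    image=condition,
    num_inference_steps=20,
).images[0]
portrait.save("stylized_portrait.png")
```

The vectorization step, turning this raster output into strokes the drawing arm can execute, would then run downstream of such a pipeline.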
Related papers
- VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation [53.63540587160549]
VidBot is a framework enabling zero-shot robotic manipulation using 3D affordances learned from in-the-wild, monocular, RGB-only human videos.
VidBot paves the way for leveraging everyday human videos to make robot learning more scalable.
arXiv Detail & Related papers (2025-03-10T10:04:58Z)
- Differentiable Robot Rendering [45.23538293501457]
We introduce differentiable robot rendering, a method allowing the visual appearance of a robot body to be directly differentiable with respect to its control parameters.
We demonstrate its capability and usage in applications including reconstruction of robot poses from images and controlling robots through vision language models.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
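As a toy illustration of the idea in the entry above (not the authors' method), the sketch below makes a rendered image differentiable with respect to joint angles and recovers a pose from a target image by gradient descent; the 2-link planar arm and the Gaussian splat renderer are invented minimal stand-ins.

```python
import torch

H = W = 64
ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
grid = torch.stack([xs, ys], dim=-1).float()

def render(theta, link=12.0, center=32.0, sigma=6.0):
    # Forward kinematics of a 2-link planar arm, splatted as a soft Gaussian
    # so every pixel is differentiable w.r.t. the joint angles. A broad sigma
    # keeps gradients informative even when the initial guess is far off.
    x = center + link * torch.cos(theta[0]) + link * torch.cos(theta[0] + theta[1])
    y = center + link * torch.sin(theta[0]) + link * torch.sin(theta[0] + theta[1])
    ee = torch.stack([x, y])
    return torch.exp(-((grid - ee) ** 2).sum(-1) / (2 * sigma ** 2))

target = render(torch.tensor([0.7, -0.3]))            # "observed" image
theta = torch.tensor([0.0, 0.0], requires_grad=True)  # unknown pose
opt = torch.optim.Adam([theta], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = ((render(theta) - target) ** 2).mean()
    loss.backward()
    opt.step()
print(theta.detach())  # recovered angles (up to the arm's kinematic redundancy)
```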
- Unifying 3D Representation and Control of Diverse Robots with a Single Camera [48.279199537720714]
We introduce Neural Jacobian Fields, an architecture that autonomously learns to model and control robots from vision alone.
Our approach achieves accurate closed-loop control and recovers the causal dynamic structure of each robot.
arXiv Detail & Related papers (2024-07-11T17:55:49Z)
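The name suggests a learned, vision-derived analogue of the classical robot Jacobian. The paper's architecture is not reproduced here, but the classical idea it generalizes, estimating how motor commands move observed features and inverting that map for closed-loop control, can be sketched in a few lines; the toy 2-joint forward model below is an invented stand-in.

```python
import numpy as np

def estimate_jacobian(f, u, eps=1e-3):
    # Finite-difference estimate of how command u moves the observed features.
    y0 = f(u)
    return np.stack([(f(u + eps * e) - y0) / eps for e in np.eye(len(u))], axis=1)

def robot(u):
    # Toy 2-joint arm: the controller only sees its output, never this model.
    return np.array([np.cos(u[0]) + np.cos(u[0] + u[1]),
                     np.sin(u[0]) + np.sin(u[0] + u[1])])

u = np.array([0.2, 0.4])
target = robot(np.array([0.9, -0.5]))  # desired feature position
for _ in range(50):
    J = estimate_jacobian(robot, u)
    # Damped least-squares step toward the target (closed-loop correction).
    u = u + 0.5 * np.linalg.lstsq(J, target - robot(u), rcond=None)[0]
print(np.linalg.norm(robot(u) - target))  # -> approximately 0
```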
- LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning [50.99807031490589]
We introduce LLARVA, a model trained with a novel instruction tuning method to unify a range of robotic learning tasks, scenarios, and environments.
We generate 8.5M image-visual trace pairs from the Open X-Embodiment dataset in order to pre-train our model.
Experiments show that LLARVA compares favorably with several contemporary baselines.
arXiv Detail & Related papers (2024-06-17T17:55:29Z)
- Choreographing the Digital Canvas: A Machine Learning Approach to Artistic Performance [9.218587190403174]
This paper introduces the concept of a design tool for artistic performances based on attribute descriptions.
The platform integrates a novel machine-learning (ML) model with an interactive interface to generate and visualize artistic movements.
arXiv Detail & Related papers (2024-03-26T01:42:13Z)
- Towards Embedding Dynamic Personas in Interactive Robots: Masquerading Animated Social Kinematics (MASK) [10.351714893090964]
This paper presents the design and development of an innovative interactive robotic system to enhance audience engagement using character-like personas.
Built upon the foundations of persona-driven dialog agents, this work extends the agent's application to the physical realm, employing robots to provide a more captivating and interactive experience.
arXiv Detail & Related papers (2024-03-15T06:22:32Z)
- Learning Orbitally Stable Systems for Diagrammatically Teaching [14.839036866911089]
Diagrammatic Teaching is a paradigm for robots to acquire novel skills, whereby the user provides 2D sketches over images of the scene to shape the robot's motion.
In this work, we tackle the problem of teaching a robot to approach a surface and then follow cyclic motion on it, where the cycle of the motion can be arbitrarily specified by a single user-provided sketch over an image from the robot's camera.
arXiv Detail & Related papers (2023-09-19T04:03:42Z)
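For intuition about what "orbitally stable" buys in the entry above: trajectories are attracted to a closed cycle rather than a point, so the robot converges onto the user's drawn loop from any nearby state. The Hopf-style oscillator below is a textbook example of such dynamics, not the paper's learned model.

```python
import numpy as np

def hopf(x, mu=1.0, omega=2.0):
    # Hopf normal form: the unit circle is an attracting limit cycle for mu > 0.
    r2 = x[0] ** 2 + x[1] ** 2
    return np.array([mu * (1.0 - r2) * x[0] - omega * x[1],
                     mu * (1.0 - r2) * x[1] + omega * x[0]])

x, dt = np.array([0.1, 0.0]), 0.01
for _ in range(3000):
    x = x + dt * hopf(x)  # Euler rollout: spirals outward onto the cycle
print(np.linalg.norm(x))  # radius -> 1.0, the attracting orbit
```

In approaches of this kind, such a canonical cycle is typically warped onto the user's sketched curve by a learned mapping.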
- Pathway to Future Symbiotic Creativity [76.20798455931603]
We propose a classification of creative systems into a hierarchy of 5 classes, showing the pathway of creativity evolving from a mimic-human artist to a Machine artist in its own right.
In art creation, machines need to understand humans' mental states, including desires, appreciation, and emotions; humans, in turn, need to understand machines' creative capabilities and limitations.
We propose a novel framework for building future Machine artists, which comes with the philosophy that a human-compatible AI system should be based on the "human-in-the-loop" principle.
arXiv Detail & Related papers (2022-08-18T15:12:02Z)
- Future Frame Prediction for Robot-assisted Surgery [57.18185972461453]
We propose a ternary prior guided variational autoencoder (TPG-VAE) model for future frame prediction in robotic surgical video sequences.
Besides content distribution, our model learns motion distribution, which is novel to handle the small movements of surgical tools.
arXiv Detail & Related papers (2021-03-18T15:12:06Z)
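The entry above says the model learns a motion distribution alongside the content distribution. As a loose, minimal sketch of that idea (the actual TPG-VAE's ternary priors and surgical-video encoder are not reproduced here), a VAE can carry two latent branches whose KL terms are regularized separately:

```python
import torch
import torch.nn as nn

class TwoBranchVAE(nn.Module):
    # Minimal VAE with separate "content" and "motion" latents; the layers,
    # sizes, and priors here are illustrative, not the paper's architecture.
    def __init__(self, dim=256, zc=32, zm=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 128), nn.ReLU())
        self.to_content = nn.Linear(128, 2 * zc)  # mean and log-variance
        self.to_motion = nn.Linear(128, 2 * zm)
        self.dec = nn.Sequential(nn.Linear(zc + zm, 128), nn.ReLU(), nn.Linear(128, dim))

    @staticmethod
    def reparameterize(stats):
        mu, logvar = stats.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1).mean()
        return z, kl

    def forward(self, x):
        h = self.enc(x)
        zc, kl_c = self.reparameterize(self.to_content(h))
        zm, kl_m = self.reparameterize(self.to_motion(h))
        return self.dec(torch.cat([zc, zm], dim=-1)), kl_c + kl_m
```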
- Making Robots Draw A Vivid Portrait In Two Minutes [11.148458054454407]
We present a drawing robot that automatically transforms a facial picture into a vivid portrait and then draws it on paper in about two minutes on average.
At the heart of our system is a novel portrait synthesis algorithm based on deep learning.
The whole portrait drawing robotic system is named AiSketcher.
arXiv Detail & Related papers (2020-05-12T03:02:24Z)
- Morphology-Agnostic Visual Robotic Control [76.44045983428701]
MAVRIC is an approach that works with minimal prior knowledge of the robot's morphology.
We demonstrate our method on visually-guided 3D point reaching, trajectory following, and robot-to-robot imitation.
arXiv Detail & Related papers (2019-12-31T15:45:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.