Negative Shanshui: Real-time Interactive Ink Painting Synthesis
- URL: http://arxiv.org/abs/2508.16612v2
- Date: Sun, 05 Oct 2025 09:19:33 GMT
- Title: Negative Shanshui: Real-time Interactive Ink Painting Synthesis
- Authors: Aven-Le Zhou
- Abstract summary: This paper presents Negative Shanshui, a real-time interactive AI synthesis approach that reinterprets classical Chinese landscape ink painting, i.e., shanshui, to engage with ecological crises in the Anthropocene. Negative Shanshui optimizes a fine-tuned Stable Diffusion model for real-time inference and integrates it with gaze-driven inpainting and frame interpolation. It enables dynamic morphing animations in response to the viewer's gaze and is presented as an interactive virtual reality (VR) experience.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents Negative Shanshui, a real-time interactive AI synthesis approach that reinterprets classical Chinese landscape ink painting, i.e., shanshui, to engage with ecological crises in the Anthropocene. Negative Shanshui optimizes a fine-tuned Stable Diffusion model for real-time inferences and integrates it with gaze-driven inpainting, frame interpolation; it enables dynamic morphing animations in response to the viewer's gaze and presents as an interactive virtual reality (VR) experience. The paper describes the complete technical pipeline, covering the system framework, optimization strategies, gaze-based interaction, and multimodal deployment in an art festival. Further analysis of audience feedback collected during its public exhibition highlights how participants variously engaged with the work through empathy, ambivalence, and critical reflection.
Related papers
- Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation [71.38488610271247]
Talking head generation creates lifelike avatars from static portraits for virtual communication and content creation.
Current models do not yet convey the feeling of truly interactive communication, often generating one-way responses that lack emotional engagement.
We propose Avatar Forcing, a new framework for interactive head avatar generation that models real-time user-avatar interactions through diffusion forcing.
arXiv Detail & Related papers (2026-01-02T11:58:48Z)
- MoReact: Generating Reactive Motion from Textual Descriptions [57.642436102978245]
MoReact is a diffusion-based method designed to disentangle the generation of global trajectories and local motions sequentially.
Our experiments, utilizing data adapted from a two-person motion dataset, demonstrate the efficacy of our approach.
arXiv Detail & Related papers (2025-09-28T14:31:41Z)
- Real-Time Intuitive AI Drawing System for Collaboration: Enhancing Human Creativity through Formal and Contextual Intent Integration [26.920087528015205]
This paper presents a real-time generative drawing system that interprets and integrates both formal intent and contextual intent.
The system achieves low-latency, two-stage transformation while supporting multi-user collaboration on shared canvases.
arXiv Detail & Related papers (2025-08-12T01:34:23Z)
- HUMOF: Human Motion Forecasting in Interactive Social Scenes [29.621970821619424]
Complex scenes present significant challenges for predicting human behaviour due to the abundance of interaction information.
We propose an effective method for human motion forecasting in interactive scenes.
Our method achieves state-of-the-art performance across four public datasets.
arXiv Detail & Related papers (2025-06-04T09:21:54Z)
- AsynFusion: Towards Asynchronous Latent Consistency Models for Decoupled Whole-Body Audio-Driven Avatars [65.53676584955686]
Whole-body audio-driven avatar pose and expression generation is a critical task for creating lifelike digital humans.
We propose AsynFusion, a novel framework that leverages diffusion transformers to achieve cohesive expression and gesture synthesis.
AsynFusion achieves state-of-the-art performance in generating real-time, synchronized whole-body animations.
arXiv Detail & Related papers (2025-05-21T03:28:53Z)
- Every Painting Awakened: A Training-free Framework for Painting-to-Animation Generation [25.834500552609136]
We introduce a training-free framework specifically designed to bring real-world static paintings to life through image-to-video (I2V) synthesis.
Existing I2V methods, primarily trained on natural video datasets, often struggle to generate dynamic outputs from static paintings.
Our framework enables plug-and-play integration with existing I2V methods, making it an ideal solution for animating real-world paintings.
arXiv Detail & Related papers (2025-03-31T05:25:49Z)
- Large Model Empowered Metaverse: State-of-the-Art, Challenges and Opportunities [28.81101395387858]
The Metaverse is an immersive, persistent digital ecosystem where users can interact, socialize, and work within 3D virtual environments.
This paper investigates the integration of large models within the Metaverse.
We propose a generative AI-based framework for optimizing Metaverse rendering.
arXiv Detail & Related papers (2025-01-18T13:52:48Z)
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations [107.88375243135579]
Given speech audio, we output multiple possibilities of gestural motion for an individual, including face, body, and hands.
We visualize the generated motion using highly photorealistic avatars that can express crucial nuances in gestures.
Experiments show our model generates appropriate and diverse gestures, outperforming both diffusion- and VQ-only methods.
arXiv Detail & Related papers (2024-01-03T18:55:16Z)
- Consistent View Synthesis with Pose-Guided Diffusion Models [51.37925069307313]
Novel view synthesis from a single image has been a cornerstone problem for many Virtual Reality applications.
We propose a pose-guided diffusion model to generate a consistent long-term video of novel views from a single image.
arXiv Detail & Related papers (2023-03-30T17:59:22Z)
- HORIZON: High-Resolution Semantically Controlled Panorama Synthesis [105.55531244750019]
Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds.
Recent breakthroughs in visual synthesis have unlocked the potential for semantic control in 2D flat images, but a direct application of these methods to panorama synthesis yields distorted content.
We unveil an innovative framework for generating high-resolution panoramas, adeptly addressing the issues of spherical distortion and edge discontinuity through sophisticated spherical modeling.
arXiv Detail & Related papers (2022-10-10T09:43:26Z)
- On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving [59.33715889581687]
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks.
This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches.
A novel loss function is proposed to improve the capabilities of attackers in inducing a misclassification of pixels.
arXiv Detail & Related papers (2022-01-05T22:33:43Z)
- VIRT: Improving Representation-based Models for Text Matching through Virtual Interaction [50.986371459817256]
We propose a novel Virtual InteRacTion mechanism, termed VIRT, to enable full and deep interaction modeling in representation-based models.
VIRT asks representation-based encoders to conduct virtual interactions to mimic the behaviors as interaction-based models do.
arXiv Detail & Related papers (2021-12-08T09:49:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.