Castle in the Sky: Dynamic Sky Replacement and Harmonization in Videos
- URL: http://arxiv.org/abs/2010.11800v1
- Date: Thu, 22 Oct 2020 15:27:31 GMT
- Title: Castle in the Sky: Dynamic Sky Replacement and Harmonization in Videos
- Authors: Zhengxia Zou
- Abstract summary: This paper proposes a vision-based method for video sky replacement and harmonization.
We decompose this artistic creation process into a set of proxy tasks: sky matting, motion estimation, and image blending.
Experiments are conducted on videos diversely captured in the wild by handheld smartphones and dash cameras.
- Score: 14.6001438297068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a vision-based method for video sky replacement and
harmonization, which can automatically generate realistic and dramatic sky
backgrounds in videos with controllable styles. Unlike previous sky
editing methods, which either focus on static photos or require inertial
measurement units integrated in the smartphone while shooting, our method
is purely vision-based, places no requirements on the capturing device,
and applies equally well to online and offline processing scenarios. Our
method runs in real time and requires no user interaction. We decompose
this artistic creation process into a set of proxy tasks: sky matting,
motion estimation, and image blending. Experiments are conducted on a
diverse set of in-the-wild videos captured by handheld smartphones and
dash cameras, and show that our method achieves high fidelity and
generalizes well in both visual quality and lighting/motion dynamics. Our
code and animated results are available at
https://jiupinjia.github.io/skyar/.
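To make the decomposition above concrete, here is a minimal per-frame sketch of how the three proxy tasks could compose: a matting network predicts a soft sky alpha mask, optical flow tracks the background motion so the new sky moves with the camera, and linear blending composites the warped sky template into the frame. The names `matting_net` and `flow_state` are hypothetical stand-ins, not the authors' API; the actual models are described in the paper.

```python
import cv2
import numpy as np

def replace_sky_frame(frame, sky_template, matting_net, prev_gray, flow_state):
    # Sky matting: a hypothetical network mapping an RGB frame to a
    # soft alpha mask in [0, 1], where 1 marks sky pixels.
    alpha = matting_net(frame)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Motion estimation: dense optical flow between consecutive frames;
    # the median flow approximates the global background/camera motion.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    flow_state[0] += float(np.median(flow[..., 0]))  # accumulated x shift
    flow_state[1] += float(np.median(flow[..., 1]))  # accumulated y shift
    M = np.float32([[1, 0, flow_state[0]], [0, 1, flow_state[1]]])
    h, w = frame.shape[:2]
    sky = cv2.warpAffine(sky_template, M, (w, h),
                         borderMode=cv2.BORDER_REPLICATE)

    # Image blending: linear composite of the warped sky into the sky region.
    out = alpha[..., None] * sky + (1.0 - alpha[..., None]) * frame
    return out.astype(frame.dtype), gray
```

A full system would also relight the foreground to match the new sky (the harmonization step); this sketch covers only the compositing path.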
Related papers
- Towards Physically-Based Sky-Modeling For Image Based Lighting [0.0]
Environment maps are a key component for rendering photorealistic outdoor scenes with coherent illumination.
Recent works have extended sky-models to be more comprehensive and inclusive of cloud formations but, as we demonstrate, existing methods fall short in faithfully recreating natural skies.
We propose AllSky, a flexible all-weather sky-model learned directly from physically captured HDRIs.
arXiv Detail & Related papers (2025-12-15T16:44:38Z)
- RealisMotion: Decomposed Human Motion Control and Video Generation in the World Space [28.70181587812075]
We propose a framework that explicitly decouples motion from appearance, subject from background, and action from trajectory.
Our method achieves state-of-the-art performance on both element-wise controllability and overall video quality.
arXiv Detail & Related papers (2025-08-12T03:02:23Z)
- S^2VG: 3D Stereoscopic and Spatial Video Generation via Denoising Frame Matrix [60.060882467801484]
We present a pose-free and training-free method that leverages an off-the-shelf monocular video generation model to produce immersive 3D videos.
Our approach first warps the generated monocular video into pre-defined camera viewpoints using estimated depth information, then applies a novel frame matrix inpainting framework.
We validate the efficacy of our proposed method by conducting experiments on videos from various generative models, such as Sora, Lumiere, WALT, and Zeroscope.
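The depth-based warping step described above can be sketched as simple disparity-based forward warping. The version below shifts pixels along a horizontal stereo baseline only, ignores z-ordering, and leaves disocclusions as holes for the subsequent inpainting stage; variable names and simplifications are mine, not the paper's.

```python
import numpy as np

def warp_to_shifted_view(frame, depth, fx, baseline=0.06):
    # Convert metric depth to a per-pixel horizontal disparity (in pixels);
    # fx is the camera's focal length in pixels.
    h, w = depth.shape
    disparity = fx * baseline / np.maximum(depth, 1e-6)

    ys = np.repeat(np.arange(h)[:, None], w, axis=1)
    xs = np.repeat(np.arange(w)[None, :], h, axis=0)
    new_xs = np.clip(np.round(xs - disparity).astype(int), 0, w - 1)

    # Forward-splat pixels into the new view; pixels that nothing maps to
    # remain marked as holes (disocclusions) for the inpainting framework.
    out = np.zeros_like(frame)
    hole = np.ones((h, w), dtype=bool)
    out[ys, new_xs] = frame
    hole[ys, new_xs] = False
    return out, hole
```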
arXiv Detail & Related papers (2025-08-11T14:50:03Z)
- Generating Fit Check Videos with a Handheld Camera [21.020454186769655]
We propose a more convenient solution that enables full-body video capture using handheld mobile devices.
Our approach takes as input two static photos (front and back) of you in a mirror, along with an IMU motion reference that you perform while holding your mobile phone.
We enable rendering into a new scene, with consistent illumination and shadows.
arXiv Detail & Related papers (2025-05-29T17:58:49Z)
- Controllable Weather Synthesis and Removal with Video Diffusion Models [61.56193902622901]
WeatherWeaver is a video diffusion model that synthesizes diverse weather effects directly into any input video.
Our model provides precise control over weather effect intensity and supports blending various weather types, ensuring both realism and adaptability.
arXiv Detail & Related papers (2025-05-01T17:59:57Z)
- ReCamMaster: Camera-Controlled Generative Rendering from A Single Video [72.42376733537925]
ReCamMaster is a camera-controlled generative video re-rendering framework.
It reproduces the dynamic scene of an input video at novel camera trajectories.
Our method also finds promising applications in video stabilization, super-resolution, and outpainting.
arXiv Detail & Related papers (2025-03-14T17:59:31Z)
- Learning Camera Movement Control from Real-World Drone Videos [25.10006841389459]
Existing AI videography methods struggle with limited appearance diversity in simulation training.
We propose a scalable method that involves collecting real-world training data.
We show that our system effectively learns to perform challenging camera movements.
arXiv Detail & Related papers (2024-12-12T18:59:54Z)
- MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos [104.1338295060383]
We present a system that allows for accurate, fast, and robust estimation of camera parameters and depth maps from casual monocular videos of dynamic scenes.
Our system is significantly more accurate and robust at camera pose and depth estimation when compared with prior and concurrent work.
arXiv Detail & Related papers (2024-12-05T18:59:42Z)
- Replace Anyone in Videos [39.4019337319795]
We propose the ReplaceAnyone framework, which focuses on localizing and manipulating human motion in videos.
Specifically, we formulate this task as an image-conditioned pose-driven video inpainting paradigm.
We introduce diverse mask forms involving regular and irregular shapes to avoid shape leakage and allow granular local control.
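As an illustration of the irregular-mask idea, the sketch below perturbs and dilates a tight person bounding box so the inpainting mask no longer traces the subject's silhouette, which is what "avoiding shape leakage" refers to. This is a hypothetical stand-in, not the ReplaceAnyone mask scheme.

```python
import cv2
import numpy as np

def irregular_mask_from_box(box, image_shape, jitter=25, seed=0):
    x0, y0, x1, y1 = box
    h, w = image_shape[:2]
    rng = np.random.default_rng(seed)

    # Start from a regular (rectangular) mask over the subject.
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[y0:y1, x0:x1] = 255

    # Add random blobs around the boundary so the outline becomes irregular.
    for _ in range(8):
        cx = int(rng.integers(max(x0 - jitter, 0), min(x1 + jitter, w)))
        cy = int(rng.integers(max(y0 - jitter, 0), min(y1 + jitter, h)))
        cv2.circle(mask, (cx, cy), int(rng.integers(10, jitter + 10)), 255, -1)

    # Dilate so the final mask never matches the true silhouette.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    return cv2.dilate(mask, kernel)
```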
arXiv Detail & Related papers (2024-09-30T03:27:33Z)
- WildVidFit: Video Virtual Try-On in the Wild via Image-Based Controlled Diffusion Models [132.77237314239025]
Video virtual try-on aims to generate realistic sequences that maintain garment identity and adapt to a person's pose and body shape in source videos.
Traditional image-based methods, relying on warping and blending, struggle with complex human movements and occlusions.
We reconceptualize video try-on as a process of generating videos conditioned on garment descriptions and human motion.
Our solution, WildVidFit, employs image-based controlled diffusion models for a streamlined, one-stage approach.
arXiv Detail & Related papers (2024-07-15T11:21:03Z)
- SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix [60.48666051245761]
We propose a pose-free and training-free approach for generating 3D stereoscopic videos.
Our method warps a generated monocular video into camera views on a stereoscopic baseline using estimated video depth.
We develop a disocclusion boundary re-injection scheme that further improves the quality of video inpainting.
arXiv Detail & Related papers (2024-06-29T08:33:55Z)
- Image Conductor: Precision Control for Interactive Video Synthesis [90.2353794019393]
Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements.
Image Conductor is a method for precise control of camera transitions and object movements to generate video assets from a single image.
arXiv Detail & Related papers (2024-06-21T17:55:05Z)
- Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models [40.71940056121056]
We present a novel approach that combines the controllability of dynamic 3D meshes with the expressivity and editability of emerging diffusion models.
We demonstrate our approach on various examples where motion can be obtained by animating rigged assets or changing the camera path.
arXiv Detail & Related papers (2023-12-03T14:17:11Z)
- Learning to Act from Actionless Videos through Dense Correspondences [87.1243107115642]
We present an approach to construct a video-based robot policy capable of reliably executing diverse tasks across different robots and environments.
Our method leverages images as a task-agnostic representation, encoding both the state and action information, and text as a general representation for specifying robot goals.
We demonstrate the efficacy of our approach in learning policies on table-top manipulation and navigation tasks.
arXiv Detail & Related papers (2023-10-12T17:59:23Z)
- DynIBaR: Neural Dynamic Image-Based Rendering [79.44655794967741]
We address the problem of synthesizing novel views from a monocular video depicting a complex dynamic scene.
We adopt a volumetric image-based rendering framework that synthesizes new viewpoints by aggregating features from nearby views.
We demonstrate significant improvements over state-of-the-art methods on dynamic scene datasets.
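The view-aggregation step described above reduces, in its simplest form, to a weighted sum of the features a 3D sample gathers from nearby source views, with weights favouring views whose viewing direction agrees with the target ray. The sketch below is illustrative only, not DynIBaR's actual architecture.

```python
import torch

def aggregate_nearby_views(sample_feats, view_dirs, target_dir):
    # sample_feats: (N, C) features of one 3D sample projected into N views
    # view_dirs:    (N, 3) unit directions from the sample to each source camera
    # target_dir:   (3,)   unit direction from the sample to the target camera
    w = torch.relu((view_dirs * target_dir).sum(dim=-1))  # directional agreement
    w = w / w.sum().clamp(min=1e-8)                       # normalize the weights
    return (w[:, None] * sample_feats).sum(dim=0)         # (C,) blended feature
```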
arXiv Detail & Related papers (2022-11-20T20:57:02Z)
- Low Light Video Enhancement by Learning on Static Videos with Cross-Frame Attention [10.119600046984088]
We develop a deep learning method for low light video enhancement by training a model on static videos.
Existing methods operate frame by frame and do not exploit the relationships among neighbouring frames.
We show that our method outperforms other state-of-the-art video enhancement algorithms when trained only on static videos.
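A minimal form of the cross-frame attention named in the title: the current frame's features attend to the features of neighbouring frames, so temporal context is exploited rather than processing each frame independently. Shapes and names below are illustrative assumptions, not the paper's architecture.

```python
import torch

def cross_frame_attention(query_feats, neighbor_feats):
    # query_feats:    (HW, C)    flattened features of the current frame
    # neighbor_feats: (T, HW, C) features of T neighbouring frames
    c = query_feats.shape[-1]
    kv = neighbor_feats.reshape(-1, c)                           # (T*HW, C)
    attn = torch.softmax(query_feats @ kv.T / c ** 0.5, dim=-1)  # (HW, T*HW)
    return attn @ kv                                             # (HW, C) fused
```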
arXiv Detail & Related papers (2022-10-09T15:49:46Z)
- Playable Environments: Video Manipulation in Space and Time [98.0621309257937]
We present Playable Environments - a new representation for interactive video generation and manipulation in space and time.
With a single image at inference time, our novel framework allows the user to move objects in 3D while generating a video by providing a sequence of desired actions.
Our method builds an environment state for each frame, which can be manipulated by our proposed action module and decoded back to the image space with volumetric rendering.
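The per-frame loop described above (state, action, decode) can be summarized in a few lines; `action_module` and `render` below are hypothetical stand-ins for the paper's learned components.

```python
def play(initial_state, actions, action_module, render):
    # Roll the environment forward one user action per frame, decoding each
    # intermediate state back to image space.
    frames, state = [], initial_state
    for action in actions:
        state = action_module(state, action)  # manipulate the environment state
        frames.append(render(state))          # e.g., volumetric rendering
    return frames
```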
arXiv Detail & Related papers (2022-03-03T18:51:05Z)
- Sky Optimization: Semantically aware image processing of skies in low-light photography [26.37385679374474]
We propose an automated method, which can run as a part of a camera pipeline, for creating accurate sky alpha-masks.
Our method performs end-to-end sky optimization in less than half a second per image on a mobile device.
arXiv Detail & Related papers (2020-06-15T20:19:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.