IN2OUT: Fine-Tuning Video Inpainting Model for Video Outpainting Using Hierarchical Discriminator
- URL: http://arxiv.org/abs/2508.00418v1
- Date: Fri, 01 Aug 2025 08:15:14 GMT
- Title: IN2OUT: Fine-Tuning Video Inpainting Model for Video Outpainting Using Hierarchical Discriminator
- Authors: Sangwoo Youn, Minji Lee, Nokap Tony Park, Yeonggyoo Jeon, Taeyoung Na
- Abstract summary: Video outpainting presents a unique challenge of extending the borders while maintaining consistency with the given content. We develop a specialized outpainting loss function that leverages both local and global features of the discriminator. Our proposed method outperforms state-of-the-art methods both quantitatively and qualitatively.
- Score: 3.6350564275444177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video outpainting presents the unique challenge of extending a video's borders while maintaining consistency with the given content. In this paper, we suggest using video inpainting models, which excel at object flow learning and reconstruction, for outpainting, rather than solely generating the background as in existing methods. However, directly applying or fine-tuning inpainting models to outpainting has been shown to be ineffective, often leading to blurry results. Our extensive experiments on discriminator designs reveal that a critical component missing from the outpainting fine-tuning process is a discriminator capable of effectively assessing the perceptual quality of the extended areas. To address this limitation, we differentiate the objectives of adversarial training into global and local goals and introduce a hierarchical discriminator that meets both. Additionally, we develop a specialized outpainting loss function that leverages both the local and global features of the discriminator. Fine-tuning on this adversarial loss function enhances the generator's ability to produce outpainted scenes that are both visually appealing and globally coherent. Our proposed method outperforms state-of-the-art methods both quantitatively and qualitatively. Supplementary materials, including the demo video and the code, are available on SigPort.
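The abstract does not give the exact architecture or loss formulation, so the following PyTorch sketch is only a hedged illustration of the general idea it describes: a discriminator with a shared trunk, a PatchGAN-style local head whose scores are restricted to the extended (outpainted) region, and a pooled global head for scene-level coherence. All module names, shapes, and loss weights below are assumptions, not the paper's method.

```python
# Illustrative sketch of a hierarchical discriminator with a combined
# local/global generator loss. Everything here is an assumption: the
# actual IN2OUT architecture and weighting are not given in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalDiscriminator(nn.Module):
    """Shared conv trunk with two heads: a local (patch-level) head that
    scores the perceptual quality of individual regions, and a global
    head that scores scene-level coherence."""
    def __init__(self, in_channels: int = 3, base: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(in_channels, base, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
        )
        # Local head: PatchGAN-style map of per-patch realism scores.
        self.local_head = nn.Conv2d(base * 2, 1, 4, stride=1, padding=1)
        # Global head: pooled features -> a single scene-level score.
        self.global_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(base * 2, 1)
        )

    def forward(self, x):
        feat = self.trunk(x)
        return self.local_head(feat), self.global_head(feat)

def generator_adv_loss(disc, fake_frames, outpaint_mask,
                       lambda_local=1.0, lambda_global=1.0):
    """Generator-side adversarial loss: local scores are averaged only
    over the extended borders, the global score over the whole scene."""
    local_scores, global_score = disc(fake_frames)
    # Downsample the outpainting mask (1 = extended area) to the local
    # score map's resolution so local supervision covers only the borders.
    mask = F.interpolate(outpaint_mask, size=local_scores.shape[-2:])
    local_loss = -(local_scores * mask).sum() / mask.sum().clamp(min=1.0)
    global_loss = -global_score.mean()
    return lambda_local * local_loss + lambda_global * global_loss

# Usage on dummy data: outpainting the right border of 128x128 frames.
disc = HierarchicalDiscriminator()
frames = torch.randn(2, 3, 128, 128)   # generator output (fake frames)
mask = torch.zeros(2, 1, 128, 128)
mask[..., 96:] = 1.0                   # columns 96+ are newly extended
generator_adv_loss(disc, frames, mask).backward()
```

Splitting the adversarial signal this way mirrors the abstract's distinction between the local perceptual quality of the extended areas and the global coherence of the full scene.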
Related papers
- GuidPaint: Class-Guided Image Inpainting with Diffusion Models [1.1902474395094222]
We propose GuidPaint, a training-free, class-guided image inpainting framework. We show that GuidPaint achieves clear improvements over existing context-aware inpainting methods in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2025-07-29T09:36:52Z) - Towards Seamless Borders: A Method for Mitigating Inconsistencies in Image Inpainting and Outpainting [22.46566055053259]
We propose two novel methods to address discrepancy issues in diffusion-based inpainting models. First, we introduce a modified Variational Autoencoder that corrects color imbalances, ensuring that the final inpainted results are free of color mismatches. Second, we propose a two-step training strategy that improves the blending of generated and existing image content during the diffusion process.
arXiv Detail & Related papers (2025-06-14T15:02:56Z) - RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting [63.567363455092234]
RefFusion is a novel 3D inpainting method based on a multi-scale personalization of an image inpainting diffusion model to the given reference view.
Our framework achieves state-of-the-art results for object removal while maintaining high controllability.
arXiv Detail & Related papers (2024-04-16T17:50:02Z) - Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation [44.92712228326116]
Video outpainting is a challenging task, aiming at generating video content outside the viewport of the input video.
We introduce MOTIA (Mastering Video Outpainting Through Input-Specific Adaptation).
MOTIA comprises two main phases: input-specific adaptation and pattern-aware outpainting.
arXiv Detail & Related papers (2024-03-20T16:53:45Z) - Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency [78.0488707697235]
We propose a post-processing approach dubbed ASUKA (Aligned Stable inpainting with UnKnown Areas prior) to improve inpainting models. A Masked Auto-Encoder (MAE) provides reconstruction-based priors that mitigate object hallucination, and a specialized VAE decoder treats latent-to-image decoding as a local task.
arXiv Detail & Related papers (2023-12-08T05:08:06Z) - Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z) - Perceptual Artifacts Localization for Inpainting [60.5659086595901]
We propose a new learning task of automatic segmentation of inpainting perceptual artifacts.
We train advanced segmentation networks on a dataset to reliably localize inpainting artifacts within inpainted images.
We also propose a new evaluation metric called Perceptual Artifact Ratio (PAR), which is the ratio of objectionable inpainted regions to the entire inpainted area (a short sketch of this ratio follows this list).
arXiv Detail & Related papers (2022-08-05T18:50:51Z) - Cylin-Painting: Seamless 360° Panoramic Image Outpainting and Beyond [136.18504104345453]
We present a Cylin-Painting framework that involves meaningful collaborations between inpainting and outpainting.
The proposed algorithm can be effectively extended to other panoramic vision tasks, such as object detection, depth estimation, and image super-resolution.
arXiv Detail & Related papers (2022-04-18T21:18:49Z) - In&Out : Diverse Image Outpainting via GAN Inversion [89.84841983778672]
Image outpainting seeks a semantically consistent extension of the input image beyond its available content.
In this work, we formulate the problem from the perspective of inverting generative adversarial networks.
Our generator renders micro-patches conditioned on their joint latent code as well as their individual positions in the image.
arXiv Detail & Related papers (2021-04-01T17:59:10Z)
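The Perceptual Artifact Ratio referenced in the Perceptual Artifacts Localization entry above has a simple definition: the fraction of the inpainted area occupied by objectionable regions. A minimal NumPy sketch under assumed binary-mask conventions (the function name and mask semantics are illustrative, not from that paper):

```python
# Illustrative PAR computation; mask conventions are assumptions.
import numpy as np

def perceptual_artifact_ratio(artifact_mask: np.ndarray,
                              inpaint_mask: np.ndarray) -> float:
    """Both masks share a shape: `inpaint_mask` marks the filled-in
    region, `artifact_mask` marks pixels flagged as perceptual artifacts."""
    inpainted = inpaint_mask.astype(bool)
    artifacts = artifact_mask.astype(bool) & inpainted  # artifacts inside the hole only
    area = int(inpainted.sum())
    return float(artifacts.sum() / area) if area else 0.0
```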
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.