Related papers: Jigsaw3D: Disentangled 3D Style Transfer via Patch Shuffling and Masking

URL: http://arxiv.org/abs/2510.10497v1
Date: Sun, 12 Oct 2025 08:22:57 GMT
Title: Jigsaw3D: Disentangled 3D Style Transfer via Patch Shuffling and Masking
Authors: Yuteng Ye, Zheng Zhang, Qinchuan Zhang, Di Wang, Youjia Zhang, Wenxiao Zhang, Wei Yang, Yuan Liu
Abstract summary: Controllable 3D style transfer seeks to restyle a 3D asset so that its textures match a reference image while preserving the integrity and multi-view consistency. We introduce Jigsaw3D, a multi-view diffusion based pipeline that decouples style from content and enables fast, view-consistent stylization.
Score: 22.27602596205736
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Controllable 3D style transfer seeks to restyle a 3D asset so that its textures match a reference image while preserving the integrity and multi-view consistency. The prevalent methods either rely on direct reference style token injection or score-distillation from 2D diffusion models, which incurs heavy per-scene optimization and often entangles style with semantic content. We introduce Jigsaw3D, a multi-view diffusion based pipeline that decouples style from content and enables fast, view-consistent stylization. Our key idea is to leverage the jigsaw operation - spatial shuffling and random masking of reference patches - to suppress object semantics and isolate stylistic statistics (color palettes, strokes, textures). We integrate these style cues into a multi-view diffusion model via reference-to-view cross-attention, producing view-consistent stylized renderings conditioned on the input mesh. The renders are then style-baked onto the surface to yield seamless textures. Across standard 3D stylization benchmarks, Jigsaw3D achieves high style fidelity and multi-view consistency with substantially lower latency, and generalizes to masked partial reference stylization, multi-object scene styling, and tileable texture generation. Project page is available at: https://babahui.github.io/jigsaw3D.github.io/
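The jigsaw operation described in the abstract (spatial shuffling plus random masking of reference patches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `jigsaw`, the non-overlapping square patch grid, the patch size, the masking ratio, and the choice to fill masked patches with zeros are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def jigsaw(image, patch=16, mask_ratio=0.25, seed=0):
    """Shuffle non-overlapping patches of an (H, W, C) image and zero out a
    random subset of them. All parameter defaults are illustrative guesses."""
    rng = np.random.default_rng(seed)
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be multiples of patch")
    gh, gw = h // patch, w // patch

    # Split the image into a (gh * gw, patch, patch, c) stack of patches.
    patches = (
        image.reshape(gh, patch, gw, patch, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(gh * gw, patch, patch, c)
    )

    # Spatial shuffling: permute patch positions to break up object layout.
    # Fancy indexing returns a copy, so the input image is left untouched.
    patches = patches[rng.permutation(gh * gw)]

    # Random masking: zero out a fixed fraction of the shuffled patches.
    n_mask = int(round(mask_ratio * gh * gw))
    masked = rng.choice(gh * gw, size=n_mask, replace=False)
    patches[masked] = 0

    # Reassemble the patch stack into an image of the original shape.
    return (
        patches.reshape(gh, gw, patch, patch, c)
        .transpose(0, 2, 1, 3, 4)
        .reshape(h, w, c)
    )


if __name__ == "__main__":
    reference = np.random.default_rng(1).random((64, 64, 3))
    styled_input = jigsaw(reference, patch=16, mask_ratio=0.25)
    print(styled_input.shape)
```

The intuition, as the abstract puts it, is that shuffling and masking suppress object semantics while local statistics such as color palettes, strokes, and textures survive inside each patch. How the resulting image is fed to the reference-to-view cross-attention is outside the scope of this sketch.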

Related papers

DiffStyle3D: Consistent 3D Gaussian Stylization via Attention Optimization [22.652699040654046]
3D style transfer enables the creation of visually expressive 3D content. We propose DiffStyle3D, a novel diffusion-based paradigm for 3DGS style transfer. We show that DiffStyle3D outperforms state-of-the-art methods, achieving higher stylization quality and visual realism.
arXiv Detail & Related papers (2026-01-27T15:41:11Z)
SSGaussian: Semantic-Aware and Structure-Preserving 3D Style Transfer [57.723850794113055]
We propose a novel 3D style transfer pipeline that integrates prior knowledge from pretrained 2D diffusion models. Our pipeline consists of two key stages: First, we leverage diffusion priors to generate stylized renderings of key viewpoints. The second is instance-level style transfer, which effectively leverages instance-level consistency across stylized key views and transfers it onto the 3D representation.
arXiv Detail & Related papers (2025-09-04T16:40:44Z)
ReStyle3D: Scene-Level Appearance Transfer with Semantic Correspondences [33.06053818091165]
ReStyle3D is a framework for scene-level appearance transfer from a single style image to a real-world scene represented by multiple views. It combines explicit semantic correspondences with multi-view consistency to achieve precise and coherent stylization. Our code, pretrained models, and dataset will be publicly released to support new applications in interior design, virtual staging, and 3D-consistent stylization.
arXiv Detail & Related papers (2025-02-14T18:54:21Z)
Style3D: Attention-guided Multi-view Style Transfer for 3D Object Generation [9.212876623996475]
Style3D is a novel approach for generating stylized 3D objects from a content image and a style image. By establishing an interplay between structural and stylistic features across multiple views, our approach enables a holistic 3D stylization process.
arXiv Detail & Related papers (2024-12-04T18:59:38Z)
Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models [54.35214051961381]
3D meshes are widely used in computer vision and graphics for their efficiency in animation and minimal memory use in movies, games, AR, and VR. However, creating temporally consistent and realistic textures for meshes remains labor-intensive for professional artists. We present Tex4D, which integrates inherent geometry from mesh sequences with video diffusion models to produce consistent textures.
arXiv Detail & Related papers (2024-10-14T17:59:59Z)
DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation [149.77077125310805]
We present DreamMesh, a novel text-to-3D architecture that pivots on well-defined surfaces (triangle meshes) to generate a high-fidelity explicit 3D model. In the coarse stage, the mesh is first deformed by text-guided Jacobians and then DreamMesh textures the mesh with an interlaced use of 2D diffusion models. In the fine stage, DreamMesh jointly manipulates the mesh and refines the texture map, leading to high-quality triangle meshes with high-fidelity textured materials.
arXiv Detail & Related papers (2024-09-11T17:59:02Z)
StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting [141.05924680451804]
StyleGaussian is a novel 3D style transfer technique. It allows instant transfer of any image's style to a 3D scene at 10 frames per second (fps).
arXiv Detail & Related papers (2024-03-12T16:44:52Z)
3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models [102.75875255071246]
3D content creation via text-driven stylization has posed a fundamental challenge to the multimedia and graphics community. We propose a new 3DStyle-Diffusion model that triggers fine-grained stylization of 3D meshes with additional controllable appearance and geometric guidance from 2D Diffusion models.
arXiv Detail & Related papers (2023-11-09T15:51:27Z)
ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections [71.46546520120162]
Estimating 3D articulated shapes like animal bodies from monocular images is inherently challenging. We propose ARTIC3D, a self-supervised framework to reconstruct per-instance 3D shapes from a sparse image collection in-the-wild. We produce realistic animations by fine-tuning the rendered shape and texture under rigid part transformations.
arXiv Detail & Related papers (2023-06-07T17:47:50Z)
StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions [11.153966202832933]
We apply style transfer on mesh reconstructions of indoor scenes. This enables VR applications like experiencing 3D environments painted in the style of a favorite artist.
arXiv Detail & Related papers (2021-12-02T18:59:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.