Disentangling Regional Primitives for Image Generation
- URL: http://arxiv.org/abs/2410.04421v3
- Date: Sun, 28 Sep 2025 03:49:42 GMT
- Title: Disentangling Regional Primitives for Image Generation
- Authors: Zhengting Chen, Lei Cheng, Lianghui Ding, Liang Lin, Quanshi Zhang
- Abstract summary: This paper explains a neural network for image generation from a new perspective. We propose a set of desirable properties to define the representation structure of a neural network for image generation.
- Score: 62.230722004629314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explains a neural network for image generation from a new perspective, i.e., explaining representation structures for image generation. We propose a set of desirable properties to define the representation structure of a neural network for image generation, including feature completeness, spatial boundedness and consistency. These properties enable us to propose a method for disentangling primitive feature components from the intermediate-layer features, where each feature component generates a primitive regional pattern covering multiple image patches. In this way, the generation of the entire image can be explained as a superposition of these feature components. We prove that these feature components, which satisfy the feature completeness property and the linear additivity property (derived from the feature completeness, spatial boundedness, and consistency properties), can be computed as OR Harsanyi interaction. Experiments have verified the faithfulness of the disentangled primitive regional patterns.
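The abstract's key claim is that feature components satisfying completeness and linear additivity can be computed as OR Harsanyi interactions, so that the full output decomposes as a sum over components. The toy sketch below illustrates this decomposition on a small set function. The formulas follow one common formulation of AND/OR interactions from the game-theoretic interaction literature, not necessarily the paper's exact construction; the set function `v`, the player set, and all function names are illustrative assumptions.

```python
from itertools import chain, combinations

def powerset(players):
    """All subsets of `players`, as tuples."""
    return chain.from_iterable(
        combinations(players, r) for r in range(len(players) + 1))

def and_harsanyi(v, players):
    """Classic (AND) Harsanyi dividends: I(S) = sum_{T⊆S} (-1)^{|S|-|T|} v(T)."""
    return {frozenset(S): sum((-1) ** (len(S) - len(T)) * v(frozenset(T))
                              for T in powerset(S))
            for S in powerset(players)}

def or_harsanyi(v, players):
    """OR interactions (one common formulation, an assumption here):
    I_or(S) = -sum_{T⊆S} (-1)^{|S|-|T|} v(N \\ T) for S != ∅, I_or(∅) = v(∅)."""
    N = frozenset(players)
    out = {frozenset(): v(frozenset())}
    for S in powerset(players):
        if S:
            out[frozenset(S)] = -sum(
                (-1) ** (len(S) - len(T)) * v(N - frozenset(T))
                for T in powerset(S))
    return out

# Toy "network output" over a set of feature components: v(T) = |T|^2.
v = lambda T: len(T) ** 2
players = (0, 1, 2)

I_and = and_harsanyi(v, players)
I_or = or_harsanyi(v, players)

# Feature completeness: the interactions sum back to the full output v(N),
# so the whole output is exactly a superposition of the components.
total = v(frozenset(players))
assert abs(sum(I_and.values()) - total) < 1e-9
assert abs(sum(I_or.values()) - total) < 1e-9
```

Both decompositions are exhaustive over subsets, so they cost O(2^n) evaluations of `v`; in the image-generation setting each "player" would correspond to a regional feature component rather than a single pixel, keeping `n` small.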
Related papers
- Topology-Guaranteed Image Segmentation: Enforcing Connectivity, Genus, and Width Constraints [7.086342996453705]
We propose a novel mathematical framework that explicitly integrates width information into the characterization of topological structures. We are able to design neural networks that can segment images with the required topological and width properties.
arXiv Detail & Related papers (2026-01-16T16:29:48Z) - IntrinsiX: High-Quality PBR Generation using Image Priors [49.90007540430264]
We introduce IntrinsiX, a novel method that generates high-quality intrinsic images from text description.
In contrast to existing text-to-image models whose outputs contain baked-in scene lighting, our approach predicts physically-based rendering (PBR) maps.
arXiv Detail & Related papers (2025-04-01T17:47:48Z) - ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation [108.69315278353932]
We introduce the Anonymous Region Transformer (ART), which facilitates the direct generation of variable multi-layer transparent images.
By enabling precise control and scalable layer generation, ART establishes a new paradigm for interactive content creation.
arXiv Detail & Related papers (2025-02-25T16:57:04Z) - Composing Parts for Expressive Object Generation [37.791770942390485]
We introduce PartComposer, a training-free method that enables image generation based on fine-grained part-level attributes. PartComposer localizes object parts by denoising the object region from a specific diffusion process. We run a localized diffusion process in each part region based on fine-grained part attributes and combine them to produce the final image.
arXiv Detail & Related papers (2024-06-14T17:31:29Z) - Parallel Backpropagation for Shared-Feature Visualization [36.31730251757713]
Recent work has shown that some out-of-category stimuli also activate neurons in high-level visual brain regions.
This may be due to visual features common among the preferred class also being present in other images.
Here, we propose a deep-learning-based approach for visualizing these features.
arXiv Detail & Related papers (2024-05-16T05:56:03Z) - Feature Accentuation: Revealing 'What' Features Respond to in Natural Images [4.4273123155989715]
We introduce a new method to the interpretability tool-kit, 'feature accentuation', which is capable of conveying both where and what in arbitrary input images induces a feature's response.
We find a particular combination of parameterization, augmentation, and regularization yields naturalistic visualizations that resemble the seed image and target feature simultaneously.
We make our precise implementation of feature accentuation available to the community as the Faccent library, an extension of Lucent.
arXiv Detail & Related papers (2024-02-15T16:01:59Z) - Image segmentation with traveling waves in an exactly solvable recurrent neural network [71.74150501418039]
We show that a recurrent neural network can effectively divide an image into groups according to a scene's structural characteristics.
We present a precise description of the mechanism underlying object segmentation in this network.
We then demonstrate a simple algorithm for object segmentation that generalizes across inputs ranging from simple geometric objects in grayscale images to natural images.
arXiv Detail & Related papers (2023-11-28T16:46:44Z) - Using a Conditional Generative Adversarial Network to Control the Statistical Characteristics of Generated Images for IACT Data Analysis [55.41644538483948]
We divide images into several classes according to the value of some property of the image, and then specify the required class when generating new images.
In the case of images from Imaging Atmospheric Cherenkov Telescopes (IACTs), an important property is the total brightness of all image pixels (image size).
We used a cGAN technique to generate images similar to those obtained in the TAIGA-IACT experiment.
arXiv Detail & Related papers (2022-11-28T22:30:33Z) - ISG: I can See Your Gene Expression [13.148183268830879]
This paper aims to predict gene expression from a histology slide image precisely.
Such a slide image has a large resolution and sparsely distributed textures.
Existing gene expression methods mainly use general components to filter textureless regions, extract features, and aggregate features uniformly across regions.
We present the ISG framework, which harnesses interactions among discriminative features from texture-abundant regions through three new modules.
arXiv Detail & Related papers (2022-10-30T02:49:37Z) - Single Image Super-Resolution via a Dual Interactive Implicit Neural Network [5.331665215168209]
We introduce a novel implicit neural network for the task of single image super-resolution at arbitrary scale factors.
We demonstrate the efficacy and flexibility of our approach against the state of the art on publicly available benchmark datasets.
arXiv Detail & Related papers (2022-10-23T02:05:19Z) - Joint Learning of Deep Texture and High-Frequency Features for Computer-Generated Image Detection [24.098604827919203]
We propose a joint learning strategy with deep texture and high-frequency features for CG image detection.
A semantic segmentation map is generated to guide the affine transformation operation.
The combination of the original image and the high-frequency components of the original and rendered images are fed into a multi-branch neural network equipped with attention mechanisms.
arXiv Detail & Related papers (2022-09-07T17:30:40Z) - Hierarchical Semantic Regularization of Latent Spaces in StyleGANs [53.98170188547775]
We propose a Hierarchical Semantic Regularizer (HSR) which aligns the hierarchical representations learnt by the generator to corresponding powerful features learnt by pretrained networks on large amounts of data.
HSR is shown to not only improve generator representations but also the linearity and smoothness of the latent style spaces, leading to the generation of more natural-looking style-edited images.
arXiv Detail & Related papers (2022-08-07T16:23:33Z) - Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot, i.e. zero-shot and few-shot, image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z) - A Unified Architecture of Semantic Segmentation and Hierarchical Generative Adversarial Networks for Expression Manipulation [52.911307452212256]
We develop a unified architecture of semantic segmentation and hierarchical GANs.
A unique advantage of our framework is that on forward pass the semantic segmentation network conditions the generative model.
We evaluate our method on two challenging facial expression translation benchmarks, AffectNet and RaFD, and a semantic segmentation benchmark, CelebAMask-HQ.
arXiv Detail & Related papers (2021-12-08T22:06:31Z) - Composition and Style Attributes Guided Image Aesthetic Assessment [66.60253358722538]
We propose a method for the automatic prediction of the aesthetics of an image.
The proposed network includes: a pre-trained network for semantic feature extraction (the Backbone); and a Multi Layer Perceptron (MLP) network that relies on the Backbone features for the prediction of image attributes (the AttributeNet).
Given an image, the proposed multi-network is able to predict: style and composition attributes, and aesthetic score distribution.
arXiv Detail & Related papers (2021-11-08T17:16:38Z) - MOGAN: Morphologic-structure-aware Generative Learning from a Single Image [59.59698650663925]
Recently proposed generative models can complete training using only a single image.
We introduce a MOrphologic-structure-aware Generative Adversarial Network named MOGAN that produces random samples with diverse appearances.
Our approach focuses on internal features including the maintenance of rational structures and variation on appearance.
arXiv Detail & Related papers (2021-03-04T12:45:23Z) - DoFE: Domain-oriented Feature Embedding for Generalizable Fundus Image Segmentation on Unseen Datasets [96.92018649136217]
We present a novel Domain-oriented Feature Embedding (DoFE) framework to improve the generalization ability of CNNs on unseen target domains.
Our DoFE framework dynamically enriches the image features with additional domain prior knowledge learned from multi-source domains.
Our framework generates satisfying segmentation results on unseen datasets and surpasses other domain generalization and network regularization methods.
arXiv Detail & Related papers (2020-10-13T07:28:39Z) - Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual feature produced by our encoder, termed as Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z) - Network Bending: Expressive Manipulation of Deep Generative Models [0.2062593640149624]
We introduce a new framework for manipulating and interacting with deep generative models that we call network bending.
We show how it allows for the direct manipulation of semantically meaningful aspects of the generative process as well as allowing for a broad range of expressive outcomes.
arXiv Detail & Related papers (2020-05-25T21:48:45Z) - Continuous learning of face attribute synthesis [4.786157629600151]
The generative adversarial network (GAN) exhibits great superiority in the face attribute synthesis task.
Existing methods have very limited effects on the expansion of new attributes.
A continuous learning method for face attribute synthesis is proposed in this work.
arXiv Detail & Related papers (2020-04-15T06:44:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.