Introducing Resizable Region Packing Problem in Image Generation, with a Heuristic Solution
- URL: http://arxiv.org/abs/2509.16363v1
- Date: Fri, 19 Sep 2025 19:11:31 GMT
- Title: Introducing Resizable Region Packing Problem in Image Generation, with a Heuristic Solution
- Authors: Hrishikesh Sharma,
- Abstract summary: We introduce a novel, practically useful manifestation of the classical Bin Packing problem in the context of synthetic image data generation. We present a novel algorithm that is generic enough to scale and pack an arbitrary number of arbitrarily shaped regions at arbitrary locations. The algorithm is validated by an implementation that was used to generate a large-scale synthetic anomaly detection dataset.
- Score: 1.6921396880325779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image data generation in computer vision has traditionally been a harder problem to solve than discriminative problems. Such data generation entails placing relevant objects, each of appropriate size, at meaningful locations in a scene canvas. There have been two popular classes of approaches to such generation: graphics-based and generative-model-based. Optimization problems are known to lurk in the background of both classes of approaches. In this paper, we introduce a novel, practically useful manifestation of the classical Bin Packing problem in the context of synthetic image data generation. We conjecture that the newly introduced problem, the Resizable Anchored Region Packing (RARP) Problem, is NP-hard, and provide detailed arguments for our conjecture. As a first solution, we present a novel heuristic algorithm that is generic enough to scale and pack an arbitrary number of arbitrarily shaped regions at arbitrary locations into an image canvas. The algorithm follows a greedy approach, iteratively packing region pairs in a careful way while obeying the optimization constraints. The algorithm is validated by an implementation that was used to generate a large-scale synthetic anomaly detection dataset, with highly varying bin packing parameters per image sample, i.e., per RARP instance. Visual inspection of such data and checking of the correctness of each solution prove the effectiveness of our algorithm. With generative modeling on the rise in deep learning, and synthetic data generation poised to become mainstream, we expect that the newly introduced problem will be valued in the imaging science community.
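To make the idea of greedily shrinking anchored regions pair by pair concrete, the following is a minimal sketch and not the paper's algorithm. It assumes axis-aligned rectangular regions, each with a fixed anchor at its center and a single uniform scale factor, and it resolves each overlapping pair by shrinking the currently larger region. The names `Region`, `greedy_pack`, `min_scale`, and `step` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Region:
    """Hypothetical rectangular region anchored at its fixed center."""
    cx: float           # anchor x, never moved
    cy: float           # anchor y, never moved
    w: float            # width at scale 1.0
    h: float            # height at scale 1.0
    scale: float = 1.0  # current uniform resize factor

    def box(self):
        hw, hh = self.w * self.scale / 2, self.h * self.scale / 2
        return (self.cx - hw, self.cy - hh, self.cx + hw, self.cy + hh)

    def area(self):
        return (self.w * self.scale) * (self.h * self.scale)


def overlaps(a, b):
    """True if the interiors of the two boxes intersect."""
    ax0, ay0, ax1, ay1 = a.box()
    bx0, by0, bx1, by1 = b.box()
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def inside(r, width, height):
    x0, y0, x1, y1 = r.box()
    return x0 >= 0 and y0 >= 0 and x1 <= width and y1 <= height


def greedy_pack(regions, width, height, min_scale=0.2, step=0.9):
    """Shrink regions in place until none overlap; False if infeasible."""
    # First make every region fit inside the canvas.
    for r in regions:
        while not inside(r, width, height):
            if r.scale * step < min_scale:
                return False
            r.scale *= step
    # Shrinking about a fixed center only yields nested boxes, so a pair
    # that has been separated once can never start overlapping again.
    # One pass over all pairs is therefore enough.
    for a, b in combinations(regions, 2):
        while overlaps(a, b):
            # Prefer shrinking the larger region of the pair.
            first, second = (a, b) if a.area() >= b.area() else (b, a)
            if first.scale * step >= min_scale:
                first.scale *= step
            elif second.scale * step >= min_scale:
                second.scale *= step
            else:
                return False  # both regions are at their minimum size
    return True
```

A call such as `greedy_pack([Region(30, 50, 60, 60), Region(70, 50, 60, 60)], 100, 100)` shrinks both regions alternately until they no longer overlap. Two regions that share the same anchor can never be separated under these assumptions, so the function reports failure for them.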
Related papers
- PolygoNet: Leveraging Simplified Polygonal Representation for Effective Image Classification [6.3286311412189304]
We propose an efficient approach that leverages polygonal representations of images using dominant points or contour coordinates. Our method significantly reduces computational requirements, accelerates training, and conserves resources. Experiments on benchmark datasets validate the effectiveness of our approach in reducing complexity, improving generalization, and facilitating edge computing applications.
arXiv Detail & Related papers (2025-04-01T22:05:00Z) - Multi-Feature Aggregation in Diffusion Models for Enhanced Face Super-Resolution [6.055006354743854]
We develop an algorithm that utilizes a low-resolution image combined with features extracted from multiple low-quality images to generate a super-resolved image.
Unlike other algorithms, our approach recovers facial features without explicitly providing attribute information.
This is the first time multi-features combined with low-resolution images are used as conditioners to generate more reliable super-resolution images.
arXiv Detail & Related papers (2024-08-27T20:08:33Z) - Learning from small data sets: Patch-based regularizers in inverse
problems for image reconstruction [1.1650821883155187]
Recent advances in machine learning require a huge amount of data and computer capacity to train the networks.
Our paper addresses the issue of learning from small data sets by taking patches of very few images into account.
We show how we can achieve uncertainty quantification by approximating the posterior using Langevin Monte Carlo methods.
arXiv Detail & Related papers (2023-12-27T15:30:05Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - High-Resolution GAN Inversion for Degraded Images in Large Diverse Datasets [39.21692649763314]
In this paper, we present a novel GAN inversion framework that utilizes the powerful generative ability of StyleGAN-XL.
To ease the inversion challenge with StyleGAN-XL, Clustering & Regularize Inversion (CRI) is proposed.
We validate our CRI scheme on multiple restoration tasks (i.e., inpainting, colorization, and super-resolution) of complex natural images, and show preferable quantitative and qualitative results.
arXiv Detail & Related papers (2023-02-07T11:24:11Z) - Curvature regularization for Non-line-of-sight Imaging from Under-sampled Data [5.591221518341613]
Non-line-of-sight (NLOS) imaging aims to reconstruct the three-dimensional hidden scenes from the data measured in the line-of-sight.
We propose novel NLOS reconstruction models based on curvature regularization.
We evaluate the proposed algorithms on both synthetic and real datasets.
arXiv Detail & Related papers (2023-01-01T14:10:43Z) - Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z) - InfinityGAN: Towards Infinite-Resolution Image Synthesis [92.40782797030977]
We present InfinityGAN, a method to generate arbitrary-resolution images.
We show how it trains and infers patch-by-patch seamlessly with low computational resources.
arXiv Detail & Related papers (2021-04-08T17:59:30Z) - Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method that aims to integrate the advantages of both classical model-based methods and deep learning-based methods.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-arts.
arXiv Detail & Related papers (2020-08-25T03:30:53Z) - A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z) - The Power of Triply Complementary Priors for Image Compressive Sensing [89.14144796591685]
We propose a joint low-rank deep (LRD) image model, which contains a pair of triply complementary priors.
We then propose a novel hybrid plug-and-play framework based on the LRD model for image CS.
To make the optimization tractable, a simple yet effective algorithm is proposed to solve the proposed H-based image CS problem.
arXiv Detail & Related papers (2020-05-16T08:17:44Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure consistency between local and global content.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.