Conditional and Residual Methods in Scalable Coding for Humans and
Machines
- URL: http://arxiv.org/abs/2305.02562v2
- Date: Tue, 4 Jul 2023 23:27:16 GMT
- Title: Conditional and Residual Methods in Scalable Coding for Humans and
Machines
- Authors: Anderson de Andrade, Alon Harell, Yalda Foroutan, Ivan V. Bajić
- Abstract summary: We present methods for conditional and residual coding in the context of scalable coding for humans and machines.
Our focus is on optimizing the rate-distortion performance of the reconstruction task using the information available in the computer vision task.
- Score: 26.32381277880991
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present methods for conditional and residual coding in the context of
scalable coding for humans and machines. Our focus is on optimizing the
rate-distortion performance of the reconstruction task using the information
available in the computer vision task. We include an information analysis of
both approaches to provide baselines and also propose an entropy model suitable
for conditional coding with increased modelling capacity and similar
tractability as previous work. We apply these methods to image reconstruction,
using, in one instance, representations created for semantic segmentation on
the Cityscapes dataset, and in another instance, representations created for
object detection on the COCO dataset. In both experiments, we obtain similar
performance between the conditional and residual methods, with the resulting
rate-distortion curves contained within our baselines.
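As a rough illustration of the kind of information comparison the abstract mentions, the sketch below contrasts the ideal rate of conditional coding, H(X|Y), with that of residual coding, H(X - Y), on a tiny discrete example. The joint distribution, the identity predictor used to form the residual, and all helper names are hypothetical and are not taken from the paper. The only general fact relied on is that H(X|Y) = H(X - Y | Y) <= H(X - Y).

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def push_forward(joint, f):
    """Distribution of f(x, y) when (x, y) is drawn from the joint pmf."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[f(x, y)] += p
    return dict(out)


# Hypothetical toy joint pmf: X plays the role of the enhancement-layer
# signal, Y the base-layer (machine task) information already decoded.
joint = {(0, 0): 0.4, (1, 0): 0.1, (1, 1): 0.1, (2, 1): 0.4}

h_x = entropy(push_forward(joint, lambda x, y: x))    # coding X on its own
h_y = entropy(push_forward(joint, lambda x, y: y))
h_cond = entropy(joint) - h_y                         # H(X|Y) = H(X,Y) - H(Y)
h_res = entropy(push_forward(joint, lambda x, y: x - y))  # H(X - Y)

print(f"H(X)     = {h_x:.4f} bits")
print(f"H(X|Y)   = {h_cond:.4f} bits (conditional coding)")
print(f"H(X - Y) = {h_res:.4f} bits (residual coding)")
```

In this toy case the conditional rate is strictly lower than the residual rate. How close the two are in practice depends on the data and on the learned predictor and entropy model.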
Related papers
- Detection Based Part-level Articulated Object Reconstruction from Single RGBD Image [52.11275397911693]
We propose an end-to-end trainable, cross-category method for reconstructing multiple man-made articulated objects from a single RGBD image.
We depart from previous works that rely on learning instance-level latent space, focusing on man-made articulated objects with predefined part counts.
Our method successfully reconstructs variously structured multiple instances that previous works cannot handle, and outperforms prior works in shape reconstruction and kinematics estimation.
arXiv Detail & Related papers (2025-04-04T05:08:04Z) - Sparse Dictionary Learning for Image Recovery by Iterative Shrinkage [0.1433758865948252]
We study the sparse coding problem in the context of sparse dictionary learning for image recovery.
We consider and compare several state-of-the-art sparse optimization methods constructed using the shrinkage operation.
arXiv Detail & Related papers (2025-03-13T13:45:37Z) - SeNM-VAE: Semi-Supervised Noise Modeling with Hierarchical Variational Autoencoder [13.453138169497903]
SeNM-VAE is a semi-supervised noise modeling method that leverages both paired and unpaired datasets to generate realistic degraded data.
We employ our method to generate paired training samples for real-world image denoising and super-resolution tasks.
Our approach excels in the quality of synthetic degraded images compared to other unpaired and paired noise modeling methods.
arXiv Detail & Related papers (2024-03-26T09:03:40Z) - Corner-to-Center Long-range Context Model for Efficient Learned Image
Compression [70.0411436929495]
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations.
We propose the Corner-to-Center transformer-based Context Model (C$^3$M) designed to enhance context and latent predictions.
In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder.
arXiv Detail & Related papers (2023-11-29T21:40:28Z) - Reconstructing Spatiotemporal Data with C-VAEs [49.1574468325115]
Conditional continuous representation of moving regions is commonly used.
In this work, we explore the capabilities of Conditional Variational Autoencoder (C-VAE) models to generate realistic representations of regions' evolution.
arXiv Detail & Related papers (2023-07-12T15:34:10Z) - Augmentation Invariance and Adaptive Sampling in Semantic Segmentation
of Agricultural Aerial Images [16.101248613062292]
We investigate the problem of semantic segmentation for agricultural aerial imagery.
The existing methods used for this task are designed without considering two characteristics of the aerial data.
We propose a solution based on two ideas: (i) we use together a set of suitable augmentation and a consistency loss to guide the model to learn semantic representations that are invariant to the photometric and geometric shifts typical of the top-down perspective.
With an extensive set of experiments conducted on the Agriculture-Vision dataset, we demonstrate that our proposed strategies improve the performance of the current state-of-the-art method.
arXiv Detail & Related papers (2022-04-17T10:19:07Z) - Conditional Variational Autoencoder for Learned Image Reconstruction [5.487951901731039]
We develop a novel framework that approximates the posterior distribution of the unknown image at each query observation.
It handles implicit noise models and priors, it incorporates the data formation process (i.e., the forward operator), and the learned reconstructive properties are transferable between different datasets.
arXiv Detail & Related papers (2021-10-22T10:02:48Z) - An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z) - MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood
Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z) - Mixing Consistent Deep Clustering [3.5786621294068373]
Good latent representations produce semantically mixed outputs when decoding linear interpolations of two latent representations.
We propose the Mixing Consistent Deep Clustering method which encourages representations to appear realistic.
We show that the proposed method can be added to existing autoencoders to further improve clustering performance.
arXiv Detail & Related papers (2020-11-03T19:47:06Z) - Set Based Stochastic Subsampling [85.5331107565578]
We propose a set-based two-stage end-to-end neural subsampling model that is jointly optimized with an arbitrary downstream task network.
We show that it outperforms the relevant baselines under low subsampling rates on a variety of tasks including image classification, image reconstruction, function reconstruction and few-shot classification.
arXiv Detail & Related papers (2020-06-25T07:36:47Z) - MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.