Autoencoder-based background reconstruction and foreground segmentation
with background noise estimation
- URL: http://arxiv.org/abs/2112.08001v1
- Date: Wed, 15 Dec 2021 09:51:00 GMT
- Title: Autoencoder-based background reconstruction and foreground segmentation
with background noise estimation
- Authors: Bruno Sauvalle and Arnaud de La Fortelle
- Abstract summary: We propose in this paper to model the background of a video sequence as a low dimensional manifold using an autoencoder.
The main novelty of the proposed model is that the autoencoder is also trained to predict the background noise, which makes it possible to compute a pixel-dependent threshold for each frame.
Although the proposed model does not use any temporal or motion information, it exceeds the state of the art for unsupervised background subtraction on the CDnet 2014 and LASIESTA datasets.
- Score: 1.3706331473063877
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Even after decades of research, dynamic scene background reconstruction and
foreground object segmentation are still considered open problems due to
various challenges such as illumination changes, camera movements, or
background noise caused by air turbulence or moving trees. We propose in this
paper to model the background of a video sequence as a low dimensional manifold
using an autoencoder and to compare the reconstructed background provided by
this autoencoder with the original image to compute the foreground/background
segmentation masks. The main novelty of the proposed model is that the
autoencoder is also trained to predict the background noise, which makes it
possible to compute for each frame a pixel-dependent threshold to perform the
background/foreground segmentation. Although the proposed model does not use
any temporal or motion information, it exceeds the state of the art for
unsupervised background subtraction on the CDnet 2014 and LASIESTA datasets,
with a significant improvement on videos where the camera is moving.
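The segmentation rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy data are hypothetical, and the autoencoder itself is abstracted away into two given arrays, the reconstructed background and the predicted per-pixel noise level. A pixel is labeled foreground when its reconstruction error exceeds a noise-scaled threshold.

```python
import numpy as np

np.random.seed(0)  # deterministic toy example

def segment_foreground(frame, background, noise, k=3.0):
    """Pixel-dependent threshold: foreground where |frame - background| > k * noise.

    `background` and `noise` stand in for the autoencoder's two outputs:
    the reconstructed background and the predicted background noise level.
    The scale factor `k` is an illustrative hyperparameter.
    """
    error = np.abs(frame.astype(np.float64) - background.astype(np.float64))
    return error > k * np.maximum(noise, 1e-6)  # boolean foreground mask

# Toy scene: flat background, moderate noise, one bright foreground patch.
bg = np.full((8, 8), 0.5)
sigma = np.full((8, 8), 0.05)                       # predicted noise level
frame = bg + np.random.normal(0.0, 0.02, (8, 8))    # noisy background frame
frame[2:4, 2:4] = 1.0                               # inserted foreground object
mask = segment_foreground(frame, bg, sigma)
```

Because the threshold scales with the predicted noise, regions with strong background noise (e.g. moving trees) need a larger reconstruction error to be flagged as foreground, while quiet regions stay sensitive to small changes.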
Related papers
- Beyond Image Prior: Embedding Noise Prior into Conditional Denoising Transformer [17.430622649002427]
Existing learning-based denoising methods typically train models to generalize the image prior from large-scale datasets.
We propose a new perspective on the denoising challenge by highlighting the distinct separation between noise and image priors.
We introduce a Locally Noise Prior Estimation algorithm, which accurately estimates the noise prior directly from a single raw noisy image.
arXiv Detail & Related papers (2024-07-12T08:43:11Z) - TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models [94.24861019513462]
TRIP is a new recipe of image-to-video diffusion paradigm.
It pivots on image noise prior derived from static image to jointly trigger inter-frame relational reasoning.
Extensive experiments on WebVid-10M, DTDB and MSR-VTT datasets demonstrate TRIP's effectiveness.
arXiv Detail & Related papers (2024-03-25T17:59:40Z) - Seeing Behind Dynamic Occlusions with Event Cameras [44.63007080623054]
We propose a novel approach to reconstruct the background from a single viewpoint.
Our solution relies for the first time on the combination of a traditional camera with an event camera.
We show that our method outperforms image inpainting methods by 3dB in terms of PSNR on our dataset.
arXiv Detail & Related papers (2023-07-28T22:20:52Z) - Self-Supervised Video Object Segmentation via Cutout Prediction and
Tagging [117.73967303377381]
We propose a novel self-supervised Video Object Segmentation (VOS) approach that strives to achieve better object-background discriminability.
Our approach is based on a discriminative learning loss formulation that takes into account both object and background information.
Our proposed approach, CT-VOS, achieves state-of-the-art results on two challenging benchmarks: DAVIS-2017 and Youtube-VOS.
arXiv Detail & Related papers (2022-04-22T17:53:27Z) - Saliency detection with moving camera via background model completion [0.5076419064097734]
We propose a new framework called saliency detection via background model completion (SDBMC)
It comprises a background modeler and deep learning background/foreground segmentation network.
We show that the background/foreground segmenter, although pre-trained with a specific video dataset, can also detect saliency in unseen videos.
arXiv Detail & Related papers (2021-10-30T11:17:58Z) - NeuralDiff: Segmenting 3D objects that move in egocentric videos [92.95176458079047]
We study the problem of decomposing the observed 3D scene into a static background and a dynamic foreground.
This task is reminiscent of the classic background subtraction problem, but is significantly harder because all parts of the scene, static and dynamic, generate a large apparent motion.
In particular, we consider egocentric videos and further separate the dynamic component into objects and the actor that observes and moves them.
arXiv Detail & Related papers (2021-10-19T12:51:35Z) - Restoration of Video Frames from a Single Blurred Image with Motion
Understanding [69.90724075337194]
We propose a novel framework to generate clean video frames from a single motion-blurred image.
We formulate video restoration from a single blurred image as an inverse problem by setting clean image sequence and their respective motion as latent factors.
Our framework is based on an encoder-decoder structure with spatial transformer network modules.
arXiv Detail & Related papers (2021-04-19T08:32:57Z) - Co-occurrence Background Model with Superpixels for Robust Background
Initialization [10.955692396874678]
We develop a co-occurrence background model with superpixel segmentation.
Results obtained from the dataset of the challenging benchmark (SBMnet) validate its performance under various challenges.
arXiv Detail & Related papers (2020-03-29T02:48:41Z) - BachGAN: High-Resolution Image Synthesis from Salient Object Layout [78.51640906030244]
We propose a new task towards more practical application for image generation - high-quality image synthesis from salient object layout.
Two main challenges spring from this new task: (i) how to generate fine-grained details and realistic textures without segmentation map input; and (ii) how to create a background and weave it seamlessly into standalone objects.
By generating the hallucinated background representation dynamically, our model can synthesize high-resolution images with both photo-realistic foreground and integral background.
arXiv Detail & Related papers (2020-03-26T00:54:44Z) - Deep Blind Video Super-resolution [85.79696784460887]
We propose a deep convolutional neural network (CNN) model to solve video SR by a blur kernel modeling approach.
The proposed CNN model consists of motion blur estimation, motion estimation, and latent image restoration modules.
We show that the proposed algorithm is able to generate clearer images with finer structural details.
arXiv Detail & Related papers (2020-03-10T13:43:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.