Autoregressive Unsupervised Image Segmentation
- URL: http://arxiv.org/abs/2007.08247v1
- Date: Thu, 16 Jul 2020 10:47:40 GMT
- Title: Autoregressive Unsupervised Image Segmentation
- Authors: Yassine Ouali, Céline Hudelot, Myriam Tami
- Abstract summary: We propose a new unsupervised image segmentation approach based on mutual information maximization between different constructed views of the inputs.
The proposed method outperforms current state-of-the-art on unsupervised image segmentation.
- Score: 8.894935073145252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose a new unsupervised image segmentation approach based
on mutual information maximization between different constructed views of the
inputs. Taking inspiration from autoregressive generative models that predict
the current pixel from past pixels in a raster-scan ordering created with
masked convolutions, we propose to use different orderings over the inputs
using various forms of masked convolutions to construct different views of the
data. For a given input, the model produces a pair of predictions with two
valid orderings, and is then trained to maximize the mutual information between
the two outputs. These outputs can either be low-dimensional features for
representation learning or output clusters corresponding to semantic labels for
clustering. While masked convolutions are used during training, in inference,
no masking is applied and we fall back to the standard convolution where the
model has access to the full input. The proposed method outperforms current
state-of-the-art on unsupervised image segmentation. It is simple and easy to
implement, and can be extended to other visual tasks and integrated seamlessly
into existing unsupervised learning methods requiring different views of the
data.
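
The training idea in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the helper names (`raster_mask`, `rotate90`, `mutual_information`), the 3x3 kernel size, the use of a 90-degree rotation to obtain a second ordering, and the outer-product estimate of the joint distribution are all assumptions made for illustration.

```python
import math


def raster_mask(k):
    """Return a k x k binary kernel mask (hypothetical helper).

    A 1 marks kernel positions that come strictly before the center
    in raster-scan order (rows top to bottom, columns left to right).
    """
    c = k // 2
    return [[1 if (i < c or (i == c and j < c)) else 0 for j in range(k)]
            for i in range(k)]


def rotate90(mask):
    """Rotate a square mask 90 degrees clockwise.

    Assumption: rotating the raster-scan mask is used here as one simple
    way to obtain a second ordering over the input.
    """
    return [list(row) for row in zip(*mask[::-1])]


def mutual_information(p1, p2, eps=1e-12):
    """Mutual information between two sets of soft cluster assignments.

    p1, p2: lists of equal length; element n is the probability vector
    over C clusters predicted for the same pixel under each ordering.
    Assumption: the joint distribution is estimated by averaging the
    outer products of paired predictions.
    """
    n, c = len(p1), len(p1[0])
    joint = [[0.0] * c for _ in range(c)]
    for a, b in zip(p1, p2):
        for i in range(c):
            for j in range(c):
                joint[i][j] += a[i] * b[j] / n
    row = [sum(joint[i]) for i in range(c)]
    col = [sum(joint[i][j] for i in range(c)) for j in range(c)]
    mi = 0.0
    for i in range(c):
        for j in range(c):
            if joint[i][j] > eps:
                mi += joint[i][j] * math.log(joint[i][j] / (row[i] * col[j]))
    return mi


# Two masks that expose different "past" pixels to a 3x3 convolution.
mask_a = raster_mask(3)
mask_b = rotate90(mask_a)

# Toy predictions for two pixels and two clusters under each ordering.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 0.0], [0.0, 1.0]]
mi = mutual_information(view_a, view_b)
```

In this toy case the two views agree perfectly and the pixels are split evenly over two clusters, so `mi` equals log 2, the largest value possible for two clusters. In an actual training loop the negative of this quantity would serve as the loss, and, as the abstract notes, the masks would be dropped at inference so the model sees the full input.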
Related papers
- One Diffusion to Generate Them All [54.82732533013014]
OneDiffusion is a versatile, large-scale diffusion model that supports bidirectional image synthesis and understanding.
It enables conditional generation from inputs such as text, depth, pose, layout, and semantic maps.
OneDiffusion allows for multi-view generation, camera pose estimation, and instant personalization using sequential image inputs.
arXiv Detail & Related papers (2024-11-25T12:11:05Z)
- Masked Image Modeling Boosting Semi-Supervised Semantic Segmentation [38.55611683982936]
We introduce a novel class-wise masked image modeling that independently reconstructs different image regions according to their respective classes.
We develop a feature aggregation strategy that minimizes the distances between features corresponding to the masked and visible parts within the same class.
In semantic space, we explore the application of masked image modeling to enhance regularization.
arXiv Detail & Related papers (2024-11-13T16:42:07Z)
- Hybrid diffusion models: combining supervised and generative pretraining for label-efficient fine-tuning of segmentation models [55.2480439325792]
We propose a new pretext task, which is to simultaneously perform image denoising and mask prediction on the first domain.
We show that fine-tuning a model pretrained using this approach leads to better results than fine-tuning a similar model trained using either supervised or unsupervised pretraining.
arXiv Detail & Related papers (2024-08-06T20:19:06Z)
- Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning [47.96387857237473]
We devise a network which can perform attention over activations obtained while processing other training samples.
Our memory models the distribution of past keys and values through the definition of prototype vectors.
We demonstrate that our proposal can increase the performance of an encoder-decoder Transformer by 3.7 CIDEr points both when training in cross-entropy only and when fine-tuning with self-critical sequence training.
arXiv Detail & Related papers (2023-08-23T18:53:00Z)
- A Semi-Paired Approach For Label-to-Image Translation [6.888253564585197]
We introduce the first semi-supervised (semi-paired) framework for label-to-image translation.
In the semi-paired setting, the model has access to a small set of paired data and a larger set of unpaired images and labels.
We propose a training algorithm for this shared network, and we present a rare classes sampling algorithm to focus on under-represented classes.
arXiv Detail & Related papers (2023-06-23T16:13:43Z)
- BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes.
Inspired by dilation and erosion from morphological image processing, we treat pixel-level segmentation problems as squeezing the object boundary.
Our method yields large gains on COCO, Cityscapes, for both instance and semantic segmentation and outperforms previous state-of-the-art PointRend in both accuracy and speed under the same setting.
arXiv Detail & Related papers (2021-05-25T04:58:51Z)
- Unsupervised Image Segmentation using Mutual Mean-Teaching [12.784209596867495]
We propose an unsupervised image segmentation model based on the Mutual Mean-Teaching (MMT) framework to produce more stable results.
Experimental results demonstrate that the proposed model is able to segment various types of images and achieves better performance than the existing methods.
arXiv Detail & Related papers (2020-12-16T13:13:34Z)
- Efficient Full Image Interactive Segmentation by Leveraging Within-image Appearance Similarity [39.17599924322882]
We propose a new approach to interactive full-image semantic segmentation.
We leverage a key observation: propagation from labeled to unlabeled pixels does not necessarily require class-specific knowledge.
We build on this observation and propose an approach capable of jointly propagating pixel labels from multiple classes.
arXiv Detail & Related papers (2020-07-16T08:21:59Z)
- Unsupervised Learning of Visual Features by Contrasting Cluster Assignments [57.33699905852397]
We propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring the computation of pairwise comparisons.
Our method simultaneously clusters the data while enforcing consistency between cluster assignments.
Our method can be trained with large and small batches and can scale to unlimited amounts of data.
arXiv Detail & Related papers (2020-06-17T14:00:42Z)
- OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering [100.32273175423146]
We present a method for simultaneously learning, in an unsupervised manner, a conditional image generator, foreground extraction and segmentation, and object removal and background completion.
The method combines a Generative Adversarial Network and a Variational Auto-Encoder, with multiple encoders, generators and discriminators, and benefits from solving all tasks at once.
arXiv Detail & Related papers (2019-12-31T18:15:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.