Unsupervised Learning Consensus Model for Dynamic Texture Videos
Segmentation
- URL: http://arxiv.org/abs/2006.16177v1
- Date: Mon, 29 Jun 2020 16:40:59 GMT
- Title: Unsupervised Learning Consensus Model for Dynamic Texture Videos
Segmentation
- Authors: Lazhar Khelifi and Max Mignotte
- Abstract summary: We present an effective unsupervised learning consensus model (ULCM) for the segmentation of dynamic texture videos.
In the proposed model, the set of values of the requantized local binary pattern (LBP) histogram around the pixel to be classified is used as the feature vector.
Experiments conducted on the challenging SynthDB dataset show that ULCM is significantly faster, easier to code, and requires only a limited number of parameters.
- Score: 12.462608802359936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic texture (DT) segmentation, and video processing in general, is
currently widely dominated by methods based on deep neural networks that
require the deployment of a large number of layers. Although this parametric
approach has shown superior performance for dynamic texture segmentation, all
current deep learning methods share a significant weakness: the lack of
sufficient annotated reference data with which to train models and make them
functional. This study explores an unsupervised segmentation
approach that can be used in the absence of training data to segment new
videos. We present an effective unsupervised learning consensus model (ULCM)
for the segmentation of dynamic texture videos. This model is designed to merge
different segmentation maps, each containing multiple regions of possibly weak
quality, in order to achieve a more accurate final segmentation. The diverse
labeling fields required for the combination process are obtained by a
simplified grouping scheme applied to the input video (on the basis of the
three orthogonal planes xy, xt, and yt). In the proposed model, the set of
values of the requantized local binary pattern (LBP) histogram around the pixel
to be classified is used as a feature vector representing both the spatial and
temporal information present in the video. Experiments conducted on the challenging
SynthDB dataset show that, contrary to current dynamic texture segmentation
approaches that either require parameter estimation or a training step, ULCM
is significantly faster, easier to code, and requires only a limited number of
parameters. Further qualitative experiments based on the YUP++ dataset
demonstrate the efficiency and competitiveness of ULCM.
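To make the recipe above concrete, here is a minimal Python sketch of the general idea the abstract describes: LBP codes computed on the three orthogonal planes of the video cube, requantized local histograms as per-pixel features, one clustering per plane, and a majority-vote fusion of the aligned label maps. This is an illustrative reconstruction, not the authors' implementation; the window size, number of histogram bins, use of K-means, and Hungarian label alignment are all assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

def lbp2d(img):
    """Basic 8-neighbour local binary pattern codes for a 2-D array."""
    c = img[1:-1, 1:-1]
    out = np.zeros(c.shape, np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        out |= (nb >= c).astype(np.uint8) << bit
    return out

def local_hist(codes, n_bins, win):
    """Per-pixel histogram of requantized LBP codes over a win x win window."""
    q = codes.astype(int) * n_bins // 256        # requantize 256 codes -> n_bins
    return np.stack([uniform_filter((q == b).astype(float), size=win)
                     for b in range(n_bins)], axis=-1)

def plane_codes(video, t):
    """LBP codes on the xy, xt and yt planes through the middle frame t."""
    _, H, W = video.shape
    xy = lbp2d(video[t])                                         # (H-2, W-2)
    xt = np.stack([lbp2d(video[:, y, :])[t - 1] for y in range(1, H - 1)])
    yt = np.stack([lbp2d(video[:, :, x])[t - 1] for x in range(1, W - 1)], axis=1)
    return xy, xt, yt                                            # all (H-2, W-2)

def align(ref, lab, k):
    """Relabel `lab` so its cluster ids best overlap `ref` (Hungarian matching)."""
    overlap = np.array([[np.sum((ref == i) & (lab == j)) for j in range(k)]
                        for i in range(k)])
    rows, cols = linear_sum_assignment(-overlap)
    mapping = np.empty(k, dtype=int)
    mapping[cols] = rows
    return mapping[lab]

def consensus_segmentation(video, k=2, n_bins=8, win=9):
    """Cluster per-plane LBP-histogram features, then fuse by majority vote."""
    t = video.shape[0] // 2
    maps = []
    for codes in plane_codes(video, t):
        feats = local_hist(codes, n_bins, win).reshape(-1, n_bins)
        maps.append(KMeans(k, n_init=10).fit_predict(feats).reshape(codes.shape))
    maps = [maps[0]] + [align(maps[0], m, k) for m in maps[1:]]
    votes = np.stack([(np.stack(maps) == c).sum(axis=0) for c in range(k)])
    return votes.argmax(axis=0)                  # (H-2, W-2) consensus labels

# Toy usage: random texture on the left, a flat region on the right.
rng = np.random.default_rng(0)
video = (rng.random((16, 48, 48)) * 255).astype(np.uint8)
video[:, :, 24:] = 128
seg = consensus_segmentation(video, k=2)
print(seg[:, :10].mean(), seg[:, -10:].mean())   # the two halves separate
```

On this toy input the flat right half receives a different label than the textured left half; a faithful reproduction would follow the paper's requantization level and fusion criterion.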
Related papers
- SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-guided Masked Video Modelling (SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
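Distributing features evenly across clusters is the hallmark of Sinkhorn-Knopp optimal-transport normalization, the technique the title's "Sinkhorn-guided" refers to (also used in SwAV-style pretraining). Below is a minimal sketch of that balancing step alone; the temperature, iteration count, and tensor shapes are illustrative assumptions, not SIGMA's actual configuration.

```python
import numpy as np

def sinkhorn_assign(scores, n_iters=3, eps=0.05):
    """Balanced soft cluster assignment via Sinkhorn-Knopp normalization.

    scores: (B, K) similarities between B space-time tube features and K
    learnable prototypes. Alternating row/column normalization pushes each
    cluster toward receiving roughly B/K total mass, preventing the
    degenerate solution where every tube maps to the same cluster.
    """
    Q = np.exp(scores / eps).T                   # (K, B) transport plan
    Q /= Q.sum()
    K, B = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(axis=1, keepdims=True); Q /= K   # rows sum to 1/K
        Q /= Q.sum(axis=0, keepdims=True); Q /= B   # columns sum to 1/B
    return (Q * B).T                             # per-tube cluster distributions

rng = np.random.default_rng(0)
scores = rng.normal(size=(256, 8))   # 256 tubes, 8 clusters (hypothetical)
q = sinkhorn_assign(scores)
print(q.sum(axis=0))                 # roughly 32 per cluster: balanced
```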
arXiv Detail & Related papers (2024-07-22T08:04:09Z)
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Video-SwinUNet: Spatio-temporal Deep Learning Framework for VFSS Instance Segmentation [10.789826145990016]
This paper presents a deep learning framework for medical video segmentation.
Our framework explicitly extracts features from neighbouring frames across the temporal dimension.
It incorporates them with a temporal feature blender, which then tokenises the high-level temporal feature to form a strong global feature encoded via a Swin Transformer.
arXiv Detail & Related papers (2023-02-22T12:09:39Z)
- Efficient Unsupervised Video Object Segmentation Network Based on Motion Guidance [1.5736899098702974]
This paper proposes a video object segmentation network based on motion guidance.
The model comprises a dual-stream network, motion guidance module, and multi-scale progressive fusion module.
The experimental results demonstrate the superior performance of the proposed method.
arXiv Detail & Related papers (2022-11-10T06:13:23Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are available only on a source dataset, but unavailable on a target dataset during the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
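The permutation idea is easy to picture with a toy sketch: cut a skeleton sequence into temporal segments, shuffle them, and ask a classifier to recover which permutation was applied. The segment count, tensor shapes, and 6-way label space below are illustrative assumptions, not the paper's exact design.

```python
import itertools
import numpy as np

# All orderings of 3 temporal segments -> a 6-way classification pretext task.
PERMS = list(itertools.permutations(range(3)))

def make_pretext_sample(seq, rng):
    """Split a (T, J, C) skeleton sequence into 3 equal temporal segments,
    shuffle them with a random permutation, and return the shuffled sequence
    plus the permutation id the network must predict. Solving this forces
    the model to learn temporal structure without action labels."""
    segments = np.array_split(seq, 3, axis=0)
    pid = rng.integers(len(PERMS))
    shuffled = np.concatenate([segments[i] for i in PERMS[pid]], axis=0)
    return shuffled, pid

rng = np.random.default_rng(0)
seq = rng.normal(size=(30, 25, 3))   # 30 frames, 25 joints, xyz coordinates
x, y = make_pretext_sample(seq, rng)
print(x.shape, "permutation id:", y)
```

The same recipe applies to body parts by permuting joint groups instead of temporal segments.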
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Segmenting Moving Objects via an Object-Centric Layered Representation [100.26138772664811]
We introduce an object-centric segmentation model with a depth-ordered layer representation.
We introduce a scalable pipeline for generating synthetic training data with multiple objects.
We evaluate the model on standard video segmentation benchmarks.
arXiv Detail & Related papers (2022-07-05T17:59:43Z)
- Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
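One plausible reading of the supervised local-global constraint, sketched with a plain InfoNCE-style objective and hypothetical shapes (the paper's exact loss and projection heads may differ): features from a fine scale attract same-class features from a coarse scale and repel the rest, after both are projected into a shared space.

```python
import numpy as np

def cross_scale_contrastive(f_local, f_global, y_local, y_global, tau=0.1):
    """Supervised cross-scale contrastive loss (illustrative sketch).

    f_local:  (Nl, D) pixel features from a high-resolution scale.
    f_global: (Ng, D) features from a coarser scale, assumed already
              projected to the same D dims and L2-normalized.
    Same-class cross-scale pairs act as positives, all others as negatives.
    """
    sim = f_local @ f_global.T / tau                 # (Nl, Ng) similarities
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = y_local[:, None] == y_global[None, :]      # supervised positives
    return -(log_p * pos).sum() / pos.sum()

rng = np.random.default_rng(0)
f_l = rng.normal(size=(64, 32)); f_l /= np.linalg.norm(f_l, axis=1, keepdims=True)
f_g = rng.normal(size=(16, 32)); f_g /= np.linalg.norm(f_g, axis=1, keepdims=True)
loss = cross_scale_contrastive(f_l, f_g,
                               rng.integers(3, size=64), rng.integers(3, size=16))
print(float(loss))
```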
arXiv Detail & Related papers (2022-03-25T01:24:24Z)
- Dynamic Dual Sampling Module for Fine-Grained Semantic Segmentation [27.624291416260185]
We propose a Dynamic Dual Sampling Module (DDSM) to conduct dynamic affinity modeling and propagate semantic context to local details.
Experiment results on both City and Camvid datasets validate the effectiveness and efficiency of the proposed approach.
arXiv Detail & Related papers (2021-05-25T04:25:47Z)
- Reviving Iterative Training with Mask Guidance for Interactive Segmentation [8.271859911016719]
Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes.
We propose a simple feedforward model for click-based interactive segmentation that employs the segmentation masks from previous steps.
We find that the models trained on a combination of COCO and LVIS with diverse and high-quality annotations show performance superior to all existing models.
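A sketch of how masks from previous steps commonly enter a feedforward interactive model: the previous mask and encoded clicks are simply stacked with the image as extra input channels, so each refinement step is a single forward pass. The disk click-encoding and channel layout here are illustrative choices, not necessarily this paper's exact scheme.

```python
import numpy as np

def build_interactive_input(image, prev_mask, pos_clicks, neg_clicks, radius=5):
    """Stack image, previous-step mask, and positive/negative click disks
    into one (H, W, 6) input tensor for a mask-guided segmentation network."""
    H, W, _ = image.shape
    yy, xx = np.mgrid[:H, :W]

    def click_map(clicks):
        m = np.zeros((H, W), np.float32)
        for cy, cx in clicks:                 # paint a small disk per click
            m[(yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2] = 1.0
        return m

    return np.concatenate([image.astype(np.float32) / 255.0,
                           prev_mask[..., None].astype(np.float32),
                           click_map(pos_clicks)[..., None],
                           click_map(neg_clicks)[..., None]], axis=-1)

img = np.zeros((64, 64, 3), np.uint8)
x = build_interactive_input(img, np.zeros((64, 64)), [(32, 32)], [])
print(x.shape)                                # (64, 64, 6)
```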
arXiv Detail & Related papers (2021-02-12T15:44:31Z)
- Learning Fast and Robust Target Models for Video Object Segmentation [83.3382606349118]
Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time.
Most previous approaches fine-tune segmentation networks on the first frame, resulting in impractical frame-rates and risk of overfitting.
We propose a novel VOS architecture consisting of two network components.
arXiv Detail & Related papers (2020-02-27T21:58:06Z)