Semantics through Time: Semi-supervised Segmentation of Aerial Videos
with Iterative Label Propagation
- URL: http://arxiv.org/abs/2010.01910v1
- Date: Fri, 2 Oct 2020 15:15:50 GMT
- Title: Semantics through Time: Semi-supervised Segmentation of Aerial Videos
with Iterative Label Propagation
- Authors: Alina Marcu, Vlad Licaret, Dragos Costea and Marius Leordeanu
- Abstract summary: This paper makes an important step towards automatic annotation by introducing SegProp.
SegProp is a novel iterative flow-based method, with a direct connection to spectral clustering in space and time.
We introduce Ruralscapes, a new dataset with high resolution (4K) images and manually-annotated dense labels every 50 frames.
Our novel SegProp automatically annotates the remaining unlabeled 98% of frames with an accuracy exceeding 90%.
- Score: 16.478668565965243
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation is a crucial task for robot navigation and safety.
However, current supervised methods require a large amount of pixelwise
annotations to yield accurate results. Labeling is a tedious and time-consuming
process that has hampered progress in low altitude UAV applications. This paper
makes an important step towards automatic annotation by introducing SegProp, a
novel iterative flow-based method, with a direct connection to spectral
clustering in space and time, to propagate the semantic labels to frames that
lack human annotations. The labels are further used in semi-supervised learning
scenarios. Motivated by the lack of a large video aerial dataset, we also
introduce Ruralscapes, a new dataset with high resolution (4K) images and
manually-annotated dense labels every 50 frames - the largest of its kind, to
the best of our knowledge. Our novel SegProp automatically annotates the
remaining unlabeled 98% of frames with an accuracy exceeding 90% (F-measure),
significantly outperforming other state-of-the-art label propagation methods.
Moreover, when integrating other methods as modules inside SegProp's iterative
label propagation loop, we achieve a significant boost over the baseline
labels. Finally, we test SegProp in a full semi-supervised setting: we train
several state-of-the-art deep neural networks on the
SegProp-automatically-labeled training frames and test them on completely novel
videos. We convincingly demonstrate, every time, a significant improvement over
the supervised scenario.
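The abstract describes SegProp as an iterative, optical-flow-based propagation of the sparse keyframe labels to the unlabeled frames between them, evaluated with the F-measure. As a rough illustration of that kind of propagation only (not the authors' SegProp implementation, which iterates the propagation and has a spectral-clustering interpretation), the sketch below warps the two keyframe label maps bracketing each unlabeled frame onto it with Farneback optical flow and takes a distance-weighted per-pixel vote; NUM_CLASSES, the voting scheme, and all function names are assumptions made for this sketch.

```python
# Hypothetical sketch of flow-based label propagation between annotated
# keyframes. Farneback flow, distance-weighted voting and NUM_CLASSES are
# illustrative assumptions, not the authors' SegProp implementation.
import cv2
import numpy as np

NUM_CLASSES = 12  # assumption: number of classes in a Ruralscapes-like label set


def flow_warp_labels(key_labels, key_gray, target_gray):
    """Warp a keyframe label map onto a target frame along dense optical flow."""
    # Flow from the target frame to the keyframe: each target pixel is mapped
    # to the keyframe location whose label it should copy.
    flow = cv2.calcOpticalFlowFarneback(target_gray, key_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = target_gray.shape
    gx, gy = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    # Nearest-neighbour lookup so class ids are never blended.
    return cv2.remap(key_labels, gx + flow[..., 0], gy + flow[..., 1],
                     cv2.INTER_NEAREST)


def propagate_between_keyframes(frames, idx_a, lab_a, idx_b, lab_b):
    """Label every frame strictly between two annotated keyframes by warping
    both keyframe label maps onto it and taking a distance-weighted vote."""
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    eye = np.eye(NUM_CLASSES, dtype=np.float32)
    propagated = {}
    for i in range(idx_a + 1, idx_b):
        warp_a = flow_warp_labels(lab_a, gray[idx_a], gray[i])
        warp_b = flow_warp_labels(lab_b, gray[idx_b], gray[i])
        w_a = (idx_b - i) / float(idx_b - idx_a)  # closer keyframe weighs more
        votes = w_a * eye[warp_a] + (1.0 - w_a) * eye[warp_b]
        propagated[i] = votes.argmax(axis=-1).astype(lab_a.dtype)
    return propagated


def mean_f_measure(pred, gt):
    """Mean per-class F-measure between a propagated and a manual label map."""
    scores = []
    for c in range(NUM_CLASSES):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        if tp + fp + fn > 0:
            scores.append(2.0 * tp / (2.0 * tp + fp + fn))
    return float(np.mean(scores)) if scores else 0.0
```

Nearest-neighbour remapping keeps class ids discrete rather than blending them; the actual method additionally iterates the propagation and can integrate other label-propagation methods as modules inside its loop, which this simplified sketch omits.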
Related papers
- Exploring Structured Semantic Prior for Multi Label Recognition with
Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works explore the image-to-label correspondence in vision-language models, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficient label supervision in MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z) - LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is highly competitive even with the fully supervised counterpart trained on 100% of the labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z) - Image Understands Point Cloud: Weakly Supervised 3D Semantic
Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.
We design a dual-branch network equipped with an active labeling strategy to maximize the power of the tiny fraction of available labels.
Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z) - Warp-Refine Propagation: Semi-Supervised Auto-labeling via
Cycle-consistency [27.77065474840873]
We propose a novel label propagation method that combines semantic cues with geometric cues to efficiently auto-label videos.
Our method learns to refine geometrically-warped labels and infuse them with learned semantic priors in a semi-supervised setting.
We quantitatively show that our method improves label-propagation by a noteworthy margin of 13.1 mIoU on the ApolloScape dataset.
arXiv Detail & Related papers (2021-09-28T02:04:18Z) - One Thing One Click: A Self-Training Approach for Weakly Supervised 3D
Semantic Segmentation [78.36781565047656]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our results are also comparable to those of the fully supervised counterparts.
arXiv Detail & Related papers (2021-04-06T02:27:25Z) - Reducing the Annotation Effort for Video Object Segmentation Datasets [50.893073670389164]
Densely labeling every frame with pixel masks does not scale to large datasets.
We use a deep convolutional network to automatically create pseudo-labels on a pixel level from much cheaper bounding box annotations.
We obtain the new TAO-VOS benchmark, which we make publicly available at www.vision.rwth-aachen.de/page/taovos.
arXiv Detail & Related papers (2020-11-02T17:34:45Z) - PCAMs: Weakly Supervised Semantic Segmentation Using Point Supervision [12.284208932393073]
This paper presents a novel procedure for producing semantic segmentation from images given some point level annotations.
We train a CNN, which is normally fully supervised, using our pseudo-labels in place of ground-truth labels.
Our method achieves state-of-the-art results for point-supervised semantic segmentation on the PASCAL VOC 2012 dataset (Everingham et al., 2010), even outperforming state-of-the-art methods that use stronger bounding-box and squiggle supervision.
arXiv Detail & Related papers (2020-07-10T21:25:27Z) - Labelling unlabelled videos from scratch with multi-modal
self-supervision [82.60652426371936]
Unsupervised labelling of a video dataset does not come for free from strong feature encoders.
We propose a novel clustering method that allows pseudo-labelling of a video dataset without any human annotations.
An extensive analysis shows that the resulting clusters have high semantic overlap to ground truth human labels.
arXiv Detail & Related papers (2020-06-24T12:28:17Z)