Semantics-Guided Moving Object Segmentation with 3D LiDAR
- URL: http://arxiv.org/abs/2205.03186v1
- Date: Fri, 6 May 2022 12:59:54 GMT
- Title: Semantics-Guided Moving Object Segmentation with 3D LiDAR
- Authors: Shuo Gu, Suling Yao, Jian Yang and Hui Kong
- Abstract summary: Moving object segmentation (MOS) is a task to distinguish moving objects from the surrounding static environment.
We propose a semantics-guided convolutional neural network for moving object segmentation.
- Score: 32.84782551737681
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moving object segmentation (MOS) is a task to distinguish moving objects,
e.g., moving vehicles and pedestrians, from the surrounding static environment.
The segmentation accuracy of MOS can have an influence on odometry, map
construction, and planning tasks. In this paper, we propose a semantics-guided
convolutional neural network for moving object segmentation. The network takes
sequential LiDAR range images as inputs. Instead of segmenting the moving
objects directly, the network conducts single-scan-based semantic segmentation
and multiple-scan-based moving object segmentation in turn. The semantic
segmentation module provides semantic priors for the MOS module, where we
propose an adjacent scan association (ASA) module to convert the semantic
features of adjacent scans into the same coordinate system to fully exploit the
cross-scan semantic features. Finally, by analyzing the difference between the
transformed features, reliable MOS results can be obtained quickly. Experimental
results on the SemanticKITTI MOS dataset prove the effectiveness of our work.
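The core idea of the abstract (warp the previous scan into the current scan's coordinate system using the relative pose, then compare the aligned data to expose motion) can be sketched roughly as follows. This is an illustrative assumption of how such an association step might look, not the authors' ASA implementation: the function names are invented, and raw 3D points stand in for the learned semantic features the paper actually transforms.

```python
import numpy as np

def transform_points(points, T):
    """Apply a 4x4 rigid transform T to an (N, 3) array of points."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    return (homo @ T.T)[:, :3]

def associate_adjacent_scan(prev_points, T_prev_to_curr):
    """Warp the previous scan into the current scan's coordinate frame,
    mimicking the role of an adjacent-scan association step."""
    return transform_points(prev_points, T_prev_to_curr)

def residual(curr_ranges, warped_ranges):
    """Per-point difference between aligned scans; large residuals are a
    simple cue for moving objects (the paper learns this comparison)."""
    return np.abs(curr_ranges - warped_ranges)
```

For a static point, the warped previous observation lands on the current one and the residual is near zero; points on moving objects leave a larger residual, which is the signal the MOS module exploits.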
Related papers
- RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant Features [6.358423536732677]
We introduce a novel approach to correct inaccurate segmentation by using robot interaction and a designed body frame-invariant feature.
We demonstrate the effectiveness of our proposed interactive perception pipeline in accurately segmenting cluttered scenes by achieving an average object segmentation accuracy rate of 80.7%.
arXiv Detail & Related papers (2024-03-04T05:03:24Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- Graph Information Bottleneck for Remote Sensing Segmentation [8.879224757610368]
This paper treats images as graph structures and introduces a simple contrastive vision GNN architecture for remote sensing segmentation.
Specifically, we construct a node-masked and edge-masked graph view to obtain an optimal graph structure representation.
We replace the convolutional module in UNet with the SC-ViG module to complete the segmentation and classification tasks.
arXiv Detail & Related papers (2023-12-05T07:23:22Z)
- Event-Free Moving Object Segmentation from Moving Ego Vehicle [88.33470650615162]
Moving object segmentation (MOS) in dynamic scenes is an important, challenging, but under-explored research topic for autonomous driving.
Most segmentation methods leverage motion cues obtained from optical flow maps.
We propose to exploit event cameras for better video understanding, which provide rich motion cues without relying on optical flow.
arXiv Detail & Related papers (2023-04-28T23:43:10Z)
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the map between hierarchical part-level segmentation and mobile part parameters.
The network predictions yield a large set of 3D objects with pseudo-labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
- Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene [8.357801312689622]
We propose RegConsist, a method for self-supervised pre-training of a semantic segmentation model.
We use a variant of contrastive learning to train a DCNN model for predicting semantic segmentation from RGB views in the target environment.
The proposed method outperforms models pre-trained on ImageNet and achieves competitive performance when using models that are trained for exactly the same task but on a different dataset.
arXiv Detail & Related papers (2022-10-04T20:10:14Z)
- Segmenting Moving Objects via an Object-Centric Layered Representation [100.26138772664811]
We introduce an object-centric segmentation model with a depth-ordered layer representation.
We introduce a scalable pipeline for generating synthetic training data with multiple objects.
We evaluate the model on standard video segmentation benchmarks.
arXiv Detail & Related papers (2022-07-05T17:59:43Z)
- Spatio-Temporal Multi-Task Learning Transformer for Joint Moving Object Detection and Segmentation [0.0]
We present a Multi-Task Learning architecture, based on Transformers, to jointly perform both tasks through one network.
We evaluate the performance of the individual tasks architecture versus the MTL setup, both with early shared encoders, and late shared encoder-decoder transformers.
arXiv Detail & Related papers (2021-06-21T20:30:44Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
- Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic Template Matching [67.02962970820505]
We introduce "tracking-by-detection" into video object segmentation (VOS).
We propose a new temporal aggregation network and a novel dynamic time-evolving template matching mechanism to achieve significantly improved performance.
We achieve new state-of-the-art performance on the DAVIS benchmark in both speed and accuracy without complicated bells and whistles, running at 0.14 seconds per frame with a J&F measure of 75.9%.
arXiv Detail & Related papers (2020-07-11T05:44:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.