Standardized Max Logits: A Simple yet Effective Approach for Identifying
Unexpected Road Obstacles in Urban-Scene Segmentation
- URL: http://arxiv.org/abs/2107.11264v1
- Date: Fri, 23 Jul 2021 14:25:02 GMT
- Title: Standardized Max Logits: A Simple yet Effective Approach for Identifying
Unexpected Road Obstacles in Urban-Scene Segmentation
- Authors: Sanghun Jung, Jungsoo Lee, Daehoon Gwak, Sungha Choi, Jaegul Choo
- Abstract summary: We propose a simple yet effective approach that standardizes the max logits in order to align the different distributions and reflect the relative meanings of max logits within each predicted class.
Our method achieves a new state-of-the-art performance on the publicly available Fishyscapes Lost & Found leaderboard by a large margin.
- Score: 18.666365568765098
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying unexpected objects on roads in semantic segmentation (e.g.,
identifying dogs on roads) is crucial in safety-critical applications. Existing
approaches use images of unexpected objects from external datasets or require
additional training (e.g., retraining segmentation networks or training an
extra network), which entails either non-trivial manual effort or
lengthy inference time. One possible alternative is to use prediction scores of
a pre-trained network such as the max logits (i.e., maximum values among
classes before the final softmax layer) for detecting such objects. However,
the distribution of max logits of each predicted class is significantly
different from each other, which degrades the performance of identifying
unexpected objects in urban-scene segmentation. To address this issue, we
propose a simple yet effective approach that standardizes the max logits in
order to align the different distributions and reflect the relative meanings of
max logits within each predicted class. Moreover, we consider the local regions
from two different perspectives based on the intuition that neighboring pixels
share similar semantic information. In contrast to previous approaches, our
method does not utilize any external datasets or require additional training,
which makes our method widely applicable to existing pre-trained segmentation
models. Such a straightforward approach achieves a new state-of-the-art
performance on the publicly available Fishyscapes Lost & Found leaderboard by
a large margin.
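The standardization the abstract describes can be sketched as follows. This is a minimal NumPy illustration, not the authors' released implementation: the function name is hypothetical, and the per-class mean and standard deviation of the max logits are assumed to be gathered beforehand (e.g., over training images).

```python
import numpy as np

def standardized_max_logits(logits, class_means, class_stds):
    """Return a per-pixel anomaly score map from standardized max logits.

    logits:      (C, H, W) pre-softmax scores from a segmentation network.
    class_means: (C,) mean of the max logit for each class, collected in advance.
    class_stds:  (C,) standard deviation of the max logit per class.
    A low standardized max logit marks a pixel as a likely unexpected object.
    """
    pred = logits.argmax(axis=0)    # (H, W) predicted class per pixel
    max_logit = logits.max(axis=0)  # (H, W) raw max logit per pixel
    # Standardize each pixel's max logit using its predicted class's statistics,
    # so scores become comparable across classes with different logit ranges.
    return (max_logit - class_means[pred]) / class_stds[pred]
```

The per-class alignment is the point: a raw max logit of 5 might be typical for one class but unusually low for another, and standardization makes the two comparable. The paper additionally exploits local regions of neighboring pixels, which this sketch omits.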
Related papers
- Pixel-wise Gradient Uncertainty for Convolutional Neural Networks
applied to Out-of-Distribution Segmentation [0.43512163406552007]
We present a method for obtaining uncertainty scores from pixel-wise loss gradients which can be computed efficiently during inference.
Our experiments show the ability of our method to identify wrong pixel classifications and to estimate prediction quality at negligible computational overhead.
arXiv Detail & Related papers (2023-03-13T08:37:59Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Segmenting Known Objects and Unseen Unknowns without Prior Knowledge [86.46204148650328]
Holistic segmentation aims to identify and separate objects of unseen, unknown categories into instances without any prior knowledge about them.
We tackle this new problem with U3HS, which finds unknowns as highly uncertain regions and clusters their corresponding instance-aware embeddings into individual objects.
Experiments on public data from MS, Cityscapes, and Lost&Found demonstrate the effectiveness of U3HS.
arXiv Detail & Related papers (2022-09-12T16:59:36Z)
- Few-Max: Few-Shot Domain Adaptation for Unsupervised Contrastive
Representation Learning [7.748713051083396]
Contrastive self-supervised learning methods learn to map data points such as images into non-parametric representation space without requiring labels.
We propose a domain adaption method for self-supervised contrastive learning, termed Few-Max, to address the issue of adaptation to a target distribution under few-shot learning.
We evaluate Few-Max on a range of source and target datasets, including ImageNet, VisDA, and fastMRI, on which Few-Max consistently outperforms other approaches.
arXiv Detail & Related papers (2022-06-21T06:46:19Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed as Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation [111.61261419566908]
Deep neural networks (DNNs) are usually trained on a closed set of semantic classes.
They are ill-equipped to handle previously-unseen objects.
Detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving.
arXiv Detail & Related papers (2021-04-30T07:58:19Z)
- Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals [78.12377360145078]
We introduce a novel two-step framework that adopts a predetermined prior in a contrastive optimization objective to learn pixel embeddings.
This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering.
In particular, when fine-tuning the learned representations using just 1% of labeled examples on PASCAL, we outperform supervised ImageNet pre-training by 7.1% mIoU.
arXiv Detail & Related papers (2021-02-11T18:54:47Z)
- TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain
Gait Recognition [77.77786072373942]
This paper proposes a Transferable Neighborhood Discovery (TraND) framework to bridge the domain gap for unsupervised cross-domain gait recognition.
We design an end-to-end trainable approach to automatically discover the confident neighborhoods of unlabeled samples in the latent space.
Our method achieves state-of-the-art results on two public datasets, i.e., CASIA-B and OU-LP.
arXiv Detail & Related papers (2021-02-09T03:07:07Z)
- Hyperspherical embedding for novel class classification [1.5952956981784217]
We present a constraint-based approach applied to representations in the latent space under the normalized softmax loss.
We experimentally validate the proposed approach for the classification of unseen classes on different datasets using both metric learning and the normalized softmax loss.
Our results show that our proposed strategy not only can be trained efficiently on larger sets of classes, as it does not require pairwise learning, but also yields better classification results than the metric learning strategies.
arXiv Detail & Related papers (2021-02-05T15:42:13Z)
- Find it if You Can: End-to-End Adversarial Erasing for Weakly-Supervised
Semantic Segmentation [6.326017213490535]
We propose a novel formulation of adversarial erasing of the attention maps.
The proposed solution does not require saliency masks, instead it uses a regularization loss to prevent the attention maps from spreading to less discriminative object regions.
Our experiments on the Pascal VOC dataset demonstrate that our adversarial approach increases segmentation performance by 2.1 mIoU compared to our baseline and by 1.0 mIoU compared to previous adversarial erasing approaches.
arXiv Detail & Related papers (2020-11-09T18:35:35Z)
- Reinforced active learning for image segmentation [34.096237671643145]
We present a new active learning strategy for semantic segmentation based on deep reinforcement learning (RL).
An agent learns a policy to select a subset of small informative image regions -- as opposed to entire images -- to be labeled from a pool of unlabeled data.
Our method proposes a new modification of the deep Q-network (DQN) formulation for active learning, adapting it to the large-scale nature of semantic segmentation problems.
arXiv Detail & Related papers (2020-02-16T14:03:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.