Enhancing Monocular Height Estimation via Weak Supervision from Imperfect Labels
- URL: http://arxiv.org/abs/2506.02534v1
- Date: Tue, 03 Jun 2025 07:14:16 GMT
- Title: Enhancing Monocular Height Estimation via Weak Supervision from Imperfect Labels
- Authors: Sining Chen, Yilei Shi, Xiao Xiang Zhu
- Abstract summary: We introduce data with imperfect labels into training pixel-wise height estimation networks. We propose an ensemble-based pipeline compatible with any monocular height estimation network.
- Score: 17.495701574116087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Monocular height estimation is considered the most efficient and cost-effective means of 3D perception in remote sensing, and it has attracted much attention since the emergence of deep learning. While training neural networks requires a large amount of data, data with perfect labels are scarce and only available within developed regions. The trained models therefore lack generalizability, which limits the potential for large-scale application of existing methods. We tackle this problem for the first time by introducing data with imperfect labels into the training of pixel-wise height estimation networks, including labels that are incomplete, inexact, and inaccurate compared to high-quality labels. We propose an ensemble-based pipeline compatible with any monocular height estimation network. Taking the challenges of noisy labels, domain shift, and the long-tailed distribution of height values into consideration, we carefully design the architecture and loss functions to leverage the information concealed in imperfect labels using weak supervision through balanced soft losses and ordinal constraints. We conduct extensive experiments on two datasets with different resolutions, DFC23 (0.5 to 1 m) and GBH (3 m). The results indicate that the proposed pipeline outperforms baselines by achieving more balanced performance across various domains, improving the average root mean square error by up to 22.94% and 18.62% on DFC23 and GBH, respectively. The efficacy of each design component is validated through ablation studies. Code is available at https://github.com/zhu-xlab/weakim2h.
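The abstract names balanced soft losses and ordinal constraints only at a high level. A minimal sketch of what such terms could look like over discretized height bins (hypothetical bin-frequency weighting and a unimodality penalty; the function names and formulation are assumptions, not the authors' implementation):

```python
import numpy as np

def balanced_soft_loss(pred_probs, soft_labels, bin_freqs):
    """Class-balanced soft cross-entropy over height bins.

    pred_probs:  (N, K) predicted probabilities per height bin
    soft_labels: (N, K) soft (possibly noisy) target distributions
    bin_freqs:   (K,)   empirical bin frequencies, used to up-weight
                 rare (tall-building) bins in a long-tailed distribution
    """
    weights = 1.0 / (bin_freqs + 1e-8)                 # up-weight rare bins
    weights = weights / weights.sum() * len(bin_freqs)  # normalize to mean 1
    ce = -(soft_labels * np.log(pred_probs + 1e-8))     # (N, K)
    return float((ce * weights).sum(axis=1).mean())

def ordinal_penalty(pred_probs):
    """Encourage ordered, unimodal bin predictions: penalize probability
    mass that violates monotone rise/fall around the argmax bin."""
    peak = pred_probs.argmax(axis=1)
    penalty = 0.0
    for i, p in enumerate(pred_probs):
        left = p[: peak[i] + 1]
        right = p[peak[i]:]
        penalty += np.clip(-np.diff(left), 0, None).sum()   # must rise to peak
        penalty += np.clip(np.diff(right), 0, None).sum()   # must fall after
    return penalty / len(pred_probs)
```

A unimodal prediction incurs zero ordinal penalty, while probability mass scattered across distant bins is penalized, which is one way to exploit the ordering of height values.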
Related papers
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z) - Towards Modality-agnostic Label-efficient Segmentation with Entropy-Regularized Distribution Alignment [62.73503467108322]
This topic is widely studied in 3D point cloud segmentation due to the difficulty of annotating point clouds densely.
Until recently, pseudo-labels have been widely employed to facilitate training with limited ground-truth labels.
Existing pseudo-labeling approaches could suffer heavily from the noises and variations in unlabelled data.
We propose a novel learning strategy to regularize the pseudo-labels generated for training, thus effectively narrowing the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2024-08-29T13:31:15Z) - An Embedding is Worth a Thousand Noisy Labels [0.11999555634662634]
We propose WANN, a Weighted Adaptive Nearest Neighbor approach to address label noise. We show WANN outperforms reference methods on diverse datasets of varying size and under various noise types and severities. Our approach, emphasizing efficiency and explainability, emerges as a simple, robust solution to overcome inherent limitations of deep neural network training.
arXiv Detail & Related papers (2024-08-26T15:32:31Z) - Multi-label Sewer Pipe Defect Recognition with Mask Attention Feature Enhancement and Label Correlation Learning [5.9184143707401775]
Multi-label pipe defect recognition is proposed based on mask attention guided feature enhancement and label correlation learning.
The proposed method achieves approximately state-of-the-art classification performance using just 1/16 of the Sewer-ML training dataset.
arXiv Detail & Related papers (2024-08-01T11:51:50Z) - Weakly-supervised positional contrastive learning: application to cirrhosis classification [45.63061034568991]
Large medical imaging datasets can be cheaply annotated with low-confidence, weak labels.
Access to high-confidence labels, such as histology-based diagnoses, is rare and costly.
We propose an efficient weakly-supervised positional (WSP) contrastive learning strategy.
arXiv Detail & Related papers (2023-07-10T15:02:13Z) - Improving Opinion-based Question Answering Systems Through Label Error Detection and Overwrite [4.894035903847371]
We propose LEDO: a model-agnostic and computationally efficient framework for Label Error Detection and Overwrite.
LEDO is based on Monte Carlo Dropout combined with uncertainty metrics, and can be easily generalized to multiple tasks and data sets.
Applying LEDO to an industry opinion-based question answering system demonstrates it is effective at improving accuracy in all the core models.
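The LEDO blurb pairs Monte Carlo Dropout with uncertainty metrics to detect and overwrite label errors. A minimal sketch of that idea (the threshold rule and names here are assumptions; the paper's actual uncertainty metrics may differ):

```python
import numpy as np

def mc_dropout_flags(stochastic_preds, labels, threshold=0.5):
    """Flag likely label errors using Monte Carlo Dropout.

    stochastic_preds: (T, N, K) class probabilities from T stochastic
                      forward passes with dropout kept active at inference
    labels:           (N,) annotated class indices
    threshold:        flag a sample when the mean confidence assigned to
                      its annotated label falls below this value
    Returns (flags, suggested): boolean mask and argmax overwrite labels.
    """
    mean_probs = stochastic_preds.mean(axis=0)                    # (N, K)
    conf_in_label = mean_probs[np.arange(len(labels)), labels]
    flags = conf_in_label < threshold
    suggested = mean_probs.argmax(axis=1)
    return flags, suggested
```

Samples whose annotated label the model consistently doubts across stochastic passes are flagged for overwriting with the model's own prediction.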
arXiv Detail & Related papers (2023-06-13T02:20:58Z) - Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers [44.344548601242444]
We introduce a novel framework, Weakly-supervised RESidual Transformer (WeakREST), to achieve high anomaly detection accuracy. We reformulate the pixel-wise anomaly localization task into a block-wise classification problem. We develop a novel ResMixMatch algorithm, capable of handling the interplay between weak labels and residual-based representations.
arXiv Detail & Related papers (2023-06-06T08:19:30Z) - All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z) - GLENet: Boosting 3D Object Detectors with Generative Label Uncertainty Estimation [70.75100533512021]
In this paper, we formulate the label uncertainty problem as the diversity of potentially plausible bounding boxes of objects.
We propose GLENet, a generative framework adapted from conditional variational autoencoders, to model the one-to-many relationship between a typical 3D object and its potential ground-truth bounding boxes with latent variables.
The label uncertainty generated by GLENet is a plug-and-play module and can be conveniently integrated into existing deep 3D detectors.
arXiv Detail & Related papers (2022-07-06T06:26:17Z) - 3D/2D regularized CNN feature hierarchy for Hyperspectral image classification [1.2359001424473932]
Convolutional Neural Networks (CNNs) have been rigorously studied for Hyperspectral Image Classification (HSIC).
We propose an idea to enhance the generalization performance of a hybrid CNN for HSIC using soft labels.
We empirically show that in improving generalization performance, label smoothing also improves model calibration.
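The soft labels referred to here are produced by label smoothing. A sketch of the standard formulation (not necessarily the paper's exact recipe):

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """Convert hard class indices to smoothed soft labels: the true
    class gets 1 - eps, and the remaining classes share eps uniformly."""
    soft = np.full((len(labels), num_classes), eps / (num_classes - 1))
    soft[np.arange(len(labels)), labels] = 1.0 - eps
    return soft
```

Training against these softened targets discourages overconfident logits, which is the mechanism behind the reported calibration improvement.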
arXiv Detail & Related papers (2021-04-25T11:26:56Z) - Danish Fungi 2020 -- Not Just Another Image Recognition Dataset [0.0]
We introduce a novel fine-grained dataset and benchmark, the Danish Fungi 2020 (DF20).
The dataset is constructed from observations submitted to the Danish Fungal Atlas.
DF20 has zero overlap with ImageNet, allowing unbiased comparison of models fine-tuned from publicly available ImageNet checkpoints.
arXiv Detail & Related papers (2021-03-18T09:33:11Z) - Semi-supervised deep learning based on label propagation in a 2D embedded space [117.9296191012968]
Proposed solutions propagate labels from a small set of supervised images to a large set of unsupervised ones to train a deep neural network model.
We present a loop in which a deep neural network (VGG-16) is retrained at each iteration on a set containing progressively more correctly labeled samples.
As the labeled set improves across iterations, so do the features learned by the network.
arXiv Detail & Related papers (2020-08-02T20:08:54Z) - Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets.
However, labeling large-scale data can be very costly and error-prone so that it is difficult to guarantee the annotation quality.
We propose a Temporal Calibrated Regularization (TCR) in which we utilize the original labels and the predictions in the previous epoch together.
arXiv Detail & Related papers (2020-07-01T04:48:49Z)
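The TCR blurb describes combining the original labels with the previous epoch's predictions. In its simplest convex-combination form (an assumption for illustration; the paper's actual calibration may differ), the training target could be blended as:

```python
import numpy as np

def temporal_calibrated_target(label_onehot, prev_pred, alpha=0.7):
    """Blend the original (possibly noisy) one-hot label with the model's
    prediction from the previous epoch, softening the influence of
    mislabeled samples as training progresses."""
    return alpha * label_onehot + (1 - alpha) * prev_pred
```

Because both inputs are probability distributions, the blended target remains a valid distribution for any alpha in [0, 1].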
This list is automatically generated from the titles and abstracts of the papers in this site.