Bridging Annotation Gaps: Transferring Labels to Align Object Detection Datasets
- URL: http://arxiv.org/abs/2506.04737v2
- Date: Fri, 06 Jun 2025 06:12:59 GMT
- Title: Bridging Annotation Gaps: Transferring Labels to Align Object Detection Datasets
- Authors: Mikhail Kennerley, Angelica Aviles-Rivero, Carola-Bibiane Schönlieb, Robby T. Tan
- Abstract summary: Label-Aligned Transfer (LAT) systematically projects annotations from diverse source datasets into a target label space. LAT achieves consistent improvements in target-domain detection performance, with gains of up to +4.8 AP over semi-supervised baselines.
- Score: 26.566426911250296
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Combining multiple object detection datasets offers a path to improved generalisation but is hindered by inconsistencies in class semantics and bounding box annotations. Some existing methods assume shared label taxonomies and resolve only spatial inconsistencies; others require manual relabelling, or produce a unified label space, which may be unsuitable when a fixed target label space is required. We propose Label-Aligned Transfer (LAT), a label transfer framework that systematically projects annotations from diverse source datasets into the label space of a target dataset. LAT begins by training dataset-specific detectors to generate pseudo-labels, which are then combined with ground-truth annotations via a Privileged Proposal Generator (PPG) that replaces the region proposal network in two-stage detectors. To further refine region features, a Semantic Feature Fusion (SFF) module injects class-aware context and features from overlapping proposals using a confidence-weighted attention mechanism. This pipeline preserves dataset-specific annotation granularity while enabling many-to-one label space transfer across heterogeneous datasets, resulting in a semantically and spatially aligned representation suitable for training a downstream detector. LAT thus jointly addresses both class-level misalignments and bounding box inconsistencies without relying on shared label spaces or manual annotations. Across multiple benchmarks, LAT demonstrates consistent improvements in target-domain detection performance, achieving gains of up to +4.8 AP over semi-supervised baselines.
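The confidence-weighted attention inside the SFF module can be pictured with a minimal sketch. The function below, its names, and the exact weighting scheme (adding log-confidence to similarity logits before a softmax) are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_overlapping_features(query_feat, overlap_feats, confidences):
    """Hypothetical sketch of confidence-weighted attention: attend over
    the features of proposals overlapping a query region, biasing the
    attention weights toward high-confidence detections."""
    sims = overlap_feats @ query_feat                 # (k,) similarity logits
    weights = softmax(sims + np.log(confidences + 1e-8))
    return weights @ overlap_feats                    # fused (d,) feature

rng = np.random.default_rng(0)
q = rng.normal(size=8)                # query proposal feature
feats = rng.normal(size=(3, 8))       # three overlapping proposals
conf = np.array([0.9, 0.5, 0.1])      # their detection confidences
fused = fuse_overlapping_features(q, feats, conf)
print(fused.shape)  # (8,)
```

The log-confidence bias makes a low-confidence proposal contribute little even when its feature is similar to the query, which is one plausible reading of "confidence-weighted attention".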
Related papers
- Semi-Supervised Multi-Label Feature Selection with Consistent Sparse Graph Learning [13.401566810844368]
Existing multi-label methods fail to evaluate the label correlations without enough labeled samples.
The similarity graph structure directly derived from the original feature space is suboptimal for multi-label problems.
We propose a consistent sparse graph learning method for multi-label semi-supervised feature selection.
arXiv Detail & Related papers (2025-05-23T13:25:41Z)
- A Methodological Framework for Measuring Spatial Labeling Similarity [1.5553847214012175]
We provide a framework to transform two spatial labelings into graphs based on location organization, labels, and attributes.
The distributions of their graph attributes are then extracted, enabling an efficient reflection of a distributional discrepancy.
We show that SLAM provides a comprehensive and accurate computation of labeling quality compared to other well-established evaluation metrics.
arXiv Detail & Related papers (2025-05-20T09:34:03Z)
- Reconsidering Feature Structure Information and Latent Space Alignment in Partial Multi-label Feature Selection [3.971316989443196]
The purpose of partial multi-label feature selection is to select the most representative subset, where the data comes from partial multi-label datasets that have label ambiguity issues.
Previous methods mainly focus on utilizing the information inside the labels and the relationship between the labels and features.
This paper proposes a method based on latent space alignment, which uses the information mined in feature space to disambiguate in latent space.
arXiv Detail & Related papers (2025-03-13T07:21:29Z)
- Exploiting Conjugate Label Information for Multi-Instance Partial-Label Learning [61.00359941983515]
Multi-instance partial-label learning (MIPL) addresses scenarios where each training sample is represented as a multi-instance bag associated with a candidate label set containing one true label and several false positives.
The proposed ELIMIPL method exploits the conjugate label information to improve the disambiguation performance.
arXiv Detail & Related papers (2024-08-26T15:49:31Z)
- Inter-Domain Mixup for Semi-Supervised Domain Adaptation [108.40945109477886]
Semi-supervised domain adaptation (SSDA) aims to bridge source and target domain distributions, with a small number of target labels available.
Existing SSDA work fails to make full use of label information from both source and target domains for feature alignment across domains.
This paper presents a novel SSDA approach, Inter-domain Mixup with Neighborhood Expansion (IDMNE), to tackle this issue.
arXiv Detail & Related papers (2024-01-21T10:20:46Z)
- Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification [19.592985329023733]
Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text.
We study the MLTC problem in annotation-free and scarce-annotation settings, in which the amount of available supervision scales linearly with the number of labels.
Our method follows three steps, (1) mapping input text into a set of preliminary label likelihoods by natural language inference using a pre-trained language model, (2) calculating a signed label dependency graph by label descriptions, and (3) updating the preliminary label likelihoods with message passing along the label dependency graph.
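Step (3) above, message passing over a signed label dependency graph, can be pictured with a minimal sketch. The update rule, the mixing weight `alpha`, and the clipping are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

def refine_likelihoods(p, A, alpha=0.3, steps=3):
    """Illustrative message passing: A[i, j] in [-1, 1] is the signed
    dependency from label j to label i. Positive edges reinforce, and
    negative edges suppress, the preliminary likelihoods p."""
    for _ in range(steps):
        msgs = A @ p                                      # aggregate signed evidence
        p = np.clip((1 - alpha) * p + alpha * msgs, 0.0, 1.0)
    return p

p0 = np.array([0.8, 0.2, 0.6])          # preliminary likelihoods from NLI
A = np.array([[ 0.0,  0.5,  0.0],       # labels 0 and 1 co-occur
              [ 0.5,  0.0, -0.7],       # labels 1 and 2 are exclusive
              [ 0.0, -0.7,  0.0]])
p = refine_likelihoods(p0, A)
print(p)
```

A negative edge between two labels pushes their refined likelihoods apart, which matches the intuition of a signed dependency graph built from label descriptions.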
arXiv Detail & Related papers (2023-09-24T04:12:52Z)
- Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework [28.898240725099782]
We build an entity recognition model requiring only a few shots of annotated document images.
We develop a novel label-aware seq2seq framework, LASER.
Experiments on two benchmark datasets demonstrate the superiority of LASER under the few-shot setting.
arXiv Detail & Related papers (2022-03-30T18:30:42Z)
- Group-aware Label Transfer for Domain Adaptive Person Re-identification [179.816105255584]
Unsupervised Domain Adaptation (UDA) person re-identification (ReID) aims at adapting the model trained on a labeled source-domain dataset to a target-domain dataset without any further annotations.
Most successful UDA-ReID approaches combine clustering-based pseudo-label prediction with representation learning and perform the two steps in an alternating fashion.
We propose a Group-aware Label Transfer (GLT) algorithm, which enables the online interaction and mutual promotion of pseudo-label prediction and representation learning.
arXiv Detail & Related papers (2021-03-23T07:57:39Z)
- Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation [55.73722120043086]
We develop a "Select, Label, and Mix" (SLM) framework to learn discriminative invariant feature representations for partial domain adaptation.
First, we present a simple yet efficient "select" module that automatically filters out outlier source samples to avoid negative transfer.
Second, the "label" module iteratively trains the classifier using both the labeled source domain data and the generated pseudo-labels for the target domain to enhance the discriminability of the latent space.
arXiv Detail & Related papers (2020-12-06T19:29:32Z)
- Object Detection with a Unified Label Space from Multiple Datasets [94.33205773893151]
Given multiple datasets with different label spaces, the goal of this work is to train a single object detector predicting over the union of all the label spaces.
Consider an object category like faces that is annotated in one dataset, but is not annotated in another dataset.
Some categories, like face here, would thus be considered foreground in one dataset, but background in another.
We propose loss functions that carefully integrate partial but correct annotations with complementary but noisy pseudo labels.
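A loss of this flavour might be sketched as follows. The masking, the binary cross-entropy form, and the confidence weighting here are illustrative assumptions, not the paper's actual loss functions:

```python
import numpy as np

def mixed_label_loss(pred, gt, pseudo, annotated, pseudo_conf):
    """Per-class binary cross-entropy: trust the ground truth where a
    class is annotated in this dataset, and fall back to
    confidence-weighted pseudo labels where it is not (sketch only)."""
    eps = 1e-8
    def bce(y, p):
        return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    loss = bce(gt, pred) * annotated \
         + bce(pseudo, pred) * (1 - annotated) * pseudo_conf
    return loss.mean()

pred = np.array([0.9, 0.3, 0.6])          # predicted class probabilities
gt = np.array([1.0, 0.0, 0.0])            # partial but correct annotations
pseudo = np.array([1.0, 0.0, 1.0])        # noisy pseudo labels
annotated = np.array([1.0, 1.0, 0.0])     # class 2 is unannotated here
pseudo_conf = np.array([0.0, 0.0, 0.8])   # pseudo-label confidence
loss = mixed_label_loss(pred, gt, pseudo, annotated, pseudo_conf)
print(round(float(loss), 4))
```

The mask keeps the "foreground in one dataset, background in another" conflict from penalising correct predictions: unannotated classes are supervised only through their pseudo labels, scaled by confidence.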
arXiv Detail & Related papers (2020-08-15T00:51:27Z)
- Domain Adaptation with Auxiliary Target Domain-Oriented Classifier [115.39091109079622]
Domain adaptation aims to transfer knowledge from a label-rich but heterogeneous domain to a label-scarce domain.
One of the most popular semi-supervised learning (SSL) techniques is pseudo-labeling, which assigns a pseudo label to each unlabeled sample.
We propose a new pseudo-labeling framework called Auxiliary Target Domain-Oriented Classifier (ATDOC).
ATDOC alleviates the bias by introducing an auxiliary classifier for target data only, to improve the quality of pseudo labels.
arXiv Detail & Related papers (2020-07-08T15:01:35Z)
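The auxiliary-classifier idea can be illustrated with a simple non-parametric sketch: pseudo labels come from a nearest-centroid classifier built on target features only, kept separate from the source-trained model. The nearest-centroid rule below is an assumed simplification, not necessarily ATDOC's exact mechanism:

```python
import numpy as np

def centroid_pseudo_labels(feats, centroids):
    """Assign each target feature the class of its nearest centroid;
    an auxiliary classifier like this avoids the source model's bias."""
    # squared distances from every feature (n, d) to every centroid (c, d)
    d = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)  # (n, c)
    return d.argmin(axis=1)

feats = np.array([[0.1, 0.0],
                  [0.9, 1.1],
                  [1.0, 0.9]])           # target-domain features
centroids = np.array([[0.0, 0.0],
                      [1.0, 1.0]])       # per-class feature centroids
labels = centroid_pseudo_labels(feats, centroids)
print(labels)  # [0 1 1]
```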
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.