End-to-end Person Search Sequentially Trained on Aggregated Dataset
- URL: http://arxiv.org/abs/2201.09604v1
- Date: Mon, 24 Jan 2022 11:22:15 GMT
- Title: End-to-end Person Search Sequentially Trained on Aggregated Dataset
- Authors: Angelique Loesch and Jaonary Rabarisoa and Romaric Audigier
- Abstract summary: We propose a new end-to-end model that jointly computes detection and feature extraction steps.
We show that aggregating more pedestrian detection datasets without costly identity annotations makes the shared feature maps more generic.
- Score: 1.9766522384767227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In video surveillance applications, person search is a challenging task
that consists of detecting people and extracting features from their silhouettes
for re-identification (re-ID) purposes. We propose a new end-to-end model that
jointly computes detection and feature extraction steps through a single deep
Convolutional Neural Network architecture. Sharing feature maps between the two
tasks to jointly describe people's commonalities and specificities allows a
faster runtime, which is valuable in real-world applications. In addition to
reaching state-of-the-art accuracy, this multi-task model can be sequentially
trained task-by-task, which results in a broader acceptance of input dataset
types. Indeed, we show that aggregating more pedestrian detection datasets
without costly identity annotations makes the shared feature maps more generic,
and improves re-ID precision. Moreover, these boosted shared feature maps
result in re-ID features more robust to a cross-dataset scenario.
Related papers
- Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z) - A Dynamic Feature Interaction Framework for Multi-task Visual Perception [100.98434079696268]
We devise an efficient unified framework to solve multiple common perception tasks.
These tasks include instance segmentation, semantic segmentation, monocular 3D detection, and depth estimation.
Our proposed framework, termed D2BNet, demonstrates a unique approach to parameter-efficient predictions for multi-task perception.
arXiv Detail & Related papers (2023-06-08T09:24:46Z) - Continual Object Detection via Prototypical Task Correlation Guided
Gating Mechanism [120.1998866178014]
We present a flexible framework for continual object detection via pRotOtypical taSk corrElaTion guided gaTing mechAnism (ROSETTA).
Concretely, a unified framework is shared by all tasks while task-aware gates are introduced to automatically select sub-models for specific tasks.
Experiments on COCO-VOC, KITTI-Kitchen, class-incremental detection on VOC and sequential learning of four tasks show that ROSETTA yields state-of-the-art performance.
arXiv Detail & Related papers (2022-05-06T07:31:28Z) - Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z) - Sequential End-to-end Network for Efficient Person Search [7.3658840620058115]
Person search aims at jointly solving Person Detection and Person Re-identification (re-ID).
Existing works have designed end-to-end networks based on Faster R-CNN.
We propose a Sequential End-to-end Network (SeqNet) to extract superior features.
arXiv Detail & Related papers (2021-03-18T10:28:24Z) - Decoupled and Memory-Reinforced Networks: Towards Effective Feature
Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z) - Multi-object Tracking with a Hierarchical Single-branch Network [31.680667324595557]
We propose an online multi-object tracking framework based on a hierarchical single-branch network.
Our novel iHOIM loss function unifies the objectives of the two sub-tasks and encourages better detection performance.
Experimental results on MOT16 and MOT20 datasets show that we can achieve state-of-the-art tracking performance.
arXiv Detail & Related papers (2021-01-06T12:14:58Z) - A Tree-structure Convolutional Neural Network for Temporal Features
Exaction on Sensor-based Multi-resident Activity Recognition [4.619245607612873]
We propose an end-to-end Tree-Structure Convolutional neural network based framework for Multi-Resident Activity Recognition (TSC-MRAR).
First, we treat each sample as an event and obtain the current event embedding through the previous sensor readings in the sliding window.
Then, in order to automatically generate the temporal features, a tree-structure network is designed to derive the temporal dependence of nearby readings.
arXiv Detail & Related papers (2020-11-05T14:31:00Z) - Deep Learning based Person Re-identification [2.9631016562930546]
We propose an efficient hierarchical re-identification approach in which color histogram based comparison is first employed to find the closest matches in the gallery set.
A silhouette part-based feature extraction scheme is adopted in each level of hierarchy to preserve the relative locations of the different body structures.
Results reveal that it outperforms most state-of-the-art approaches in terms of overall accuracy.
arXiv Detail & Related papers (2020-05-07T07:30:28Z) - FairMOT: On the Fairness of Detection and Re-Identification in Multiple
Object Tracking [92.48078680697311]
Multi-object tracking (MOT) is an important problem in computer vision.
We present a simple yet effective approach termed FairMOT, based on the anchor-free object detection architecture CenterNet.
The approach achieves high accuracy for both detection and tracking.
arXiv Detail & Related papers (2020-04-04T08:18:00Z)