AG-VPReID 2025: Aerial-Ground Video-based Person Re-identification Challenge Results
- URL: http://arxiv.org/abs/2506.22843v1
- Date: Sat, 28 Jun 2025 10:45:30 GMT
- Title: AG-VPReID 2025: Aerial-Ground Video-based Person Re-identification Challenge Results
- Authors: Kien Nguyen, Clinton Fookes, Sridha Sridharan, Huy Nguyen, Feng Liu, Xiaoming Liu, Arun Ross, Dana Michalski, Tamás Endrei, Ivan DeAndres-Tame, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Julian Fierrez, Javier Ortega-Garcia, Zijing Gong, Yuhao Wang, Xuehu Liu, Pingping Zhang, Md Rashidunnabi, Hugo Proença, Kailash A. Hambarde, Saeid Rezaei,
- Abstract summary: This paper introduces the AG-VPReID 2025 Challenge - the first large-scale video-based competition focused on high-altitude (80-120m) aerial-ground ReID. The challenge was constructed on the new AG-VPReID dataset with 3,027 identities, over 13,500 tracklets, and approximately 3.7 million frames captured from UAVs, CCTV, and wearable cameras. The leading approach, X-TFCLIP from UAM, attained 72.28% Rank-1 accuracy in the aerial-to-ground ReID setting and 70.77% in the ground-to-aerial ReID setting.
- Score: 64.38412449125872
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Person re-identification (ReID) across aerial and ground vantage points has become crucial for large-scale surveillance and public safety applications. Although significant progress has been made in ground-only scenarios, bridging the aerial-ground domain gap remains a formidable challenge due to extreme viewpoint differences, scale variations, and occlusions. Building upon the achievements of the AG-ReID 2023 Challenge, this paper introduces the AG-VPReID 2025 Challenge - the first large-scale video-based competition focused on high-altitude (80-120m) aerial-ground ReID. Constructed on the new AG-VPReID dataset with 3,027 identities, over 13,500 tracklets, and approximately 3.7 million frames captured from UAVs, CCTV, and wearable cameras, the challenge featured four international teams. These teams developed solutions ranging from multi-stream architectures to transformer-based temporal reasoning and physics-informed modeling. The leading approach, X-TFCLIP from UAM, attained 72.28% Rank-1 accuracy in the aerial-to-ground ReID setting and 70.77% in the ground-to-aerial ReID setting, surpassing existing baselines while highlighting the dataset's complexity. For additional details, please refer to the official website at https://agvpreid25.github.io.
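The Rank-1 figures quoted in the abstract can be read as follows: for each query tracklet from one platform, the gallery tracklets from the other platform are ranked by feature similarity, and the query counts as a hit if the top-ranked gallery tracklet has the same identity. The sketch below illustrates that idea only; it assumes one feature vector per tracklet and cosine similarity, and it uses made-up toy features. The function names and data are hypothetical, and the challenge's official evaluation protocol may differ in its details.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose most similar gallery item shares their identity."""
    hits = 0
    for q_feat, q_id in zip(query_feats, query_ids):
        # Index of the gallery item with the highest similarity to this query.
        best = max(
            range(len(gallery_feats)),
            key=lambda i: cosine(q_feat, gallery_feats[i]),
        )
        if gallery_ids[best] == q_id:
            hits += 1
    return hits / len(query_ids)


# Toy aerial-to-ground example (illustrative values, not challenge data):
# aerial tracklets act as queries, ground tracklets as the gallery.
aerial_feats = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.6], [0.95, 0.3]]
aerial_ids = [1, 2, 3, 3]
ground_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ground_ids = [1, 2, 3]

a2g_rank1 = rank1_accuracy(aerial_feats, aerial_ids, ground_feats, ground_ids)
print(f"Aerial-to-ground Rank-1: {a2g_rank1:.2%}")
```

Swapping the query and gallery arguments gives the ground-to-aerial direction, which is why the two settings can yield different scores for the same model.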
Related papers
- AG-VPReID.VIR: Bridging Aerial and Ground Platforms for Video-based Visible-Infrared Person Re-ID [36.00219379027019]
We present AG-VPReID.VIR, the first aerial-ground cross-modality video-based person Re-ID dataset. This dataset captures 1,837 identities across 4,861 tracklets (124,855 frames) using both UAV-mounted and fixed CCTV cameras in RGB and infrared modalities. Our approach bridges the domain gaps between aerial-ground perspectives and RGB-IR modalities, through style-robust feature learning, memory-based cross-view adaptation, and intermediary-guided temporal modeling.
arXiv Detail & Related papers (2025-07-24T00:13:25Z)
- MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results [15.90859212645041]
This paper introduces the SMOT4SB challenge, which leverages temporal information to address limitations of single-frame detection. Our three main contributions are: (1) the SMOT4SB dataset, consisting of 211 UAV video sequences with 108,192 annotated frames under diverse real-world conditions; (2) SO-HOTA, a novel metric combining Dot Distance with HOTA to mitigate the sensitivity of IoU-based metrics to small displacements; and (3) a competitive MVA2025 challenge with 78 participants and 308 submissions, where the winning method achieved a 5.1x improvement over the baseline.
arXiv Detail & Related papers (2025-07-17T06:45:47Z)
- AG-VPReID: A Challenging Large-Scale Benchmark for Aerial-Ground Video-based Person Re-Identification [39.350429734981184]
We introduce AG-VPReID, a new large-scale dataset for aerial-ground video-based person re-identification (ReID). This dataset comprises 6,632 subjects, 32,321 tracklets and over 9.6 million frames captured by drones (altitudes ranging from 15-120m), CCTV, and wearable cameras. We propose AG-VPReID-Net, an end-to-end framework composed of three complementary streams.
arXiv Detail & Related papers (2025-03-11T07:38:01Z)
- Unified Physical-Digital Attack Detection Challenge [70.67222784932528]
Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems.
UniAttackData is the largest public dataset for Unified Attack Detection.
We organized a Unified Physical-Digital Face Attack Detection Challenge to boost the research in Unified Attack Detections.
arXiv Detail & Related papers (2024-04-09T11:00:11Z)
- View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network [87.36616083812058]
A view-decoupled transformer (VDT) is proposed as a simple yet effective framework for aerial-ground person re-identification.
Two major components are designed in VDT to decouple view-related and view-unrelated features.
In addition, we contribute a large-scale AGPReID dataset called CARGO, consisting of five/eight aerial/ground cameras, 5,000 identities, and 108,563 images.
arXiv Detail & Related papers (2024-03-21T16:08:21Z)
- AG-ReID.v2: Bridging Aerial and Ground Views for Person Re-identification [39.58286453178339]
Aerial-ground person re-identification (Re-ID) presents unique challenges in computer vision.
We introduce AG-ReID.v2, a dataset specifically designed for person Re-ID in mixed aerial and ground scenarios.
This dataset comprises 100,502 images of 1,615 unique individuals, each annotated with matching IDs and 15 soft attribute labels.
arXiv Detail & Related papers (2024-01-05T04:53:33Z)
- Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? [57.77643186237265]
We present Multiview Aerial Visual RECognition or MAVREC, a video dataset where we record synchronized scenes from different perspectives.
MAVREC consists of around 2.5 hours of industry-standard 2.7K resolution video sequences, more than 0.5 million frames, and 1.1 million annotated bounding boxes.
This makes MAVREC the largest ground and aerial-view dataset, and the fourth largest among all drone-based datasets.
arXiv Detail & Related papers (2023-12-07T18:59:14Z)
- The Second Monocular Depth Estimation Challenge [93.1678025923996]
The second edition of the Monocular Depth Estimation Challenge (MDEC) was open to methods using any form of supervision.
The challenge was based around the SYNS-Patches dataset, which features a wide diversity of environments with high-quality dense ground-truth.
The top supervised submission improved relative F-Score by 27.62%, while the top self-supervised improved it by 16.61%.
arXiv Detail & Related papers (2023-04-14T11:10:07Z)
- Aerial-Ground Person Re-ID [43.241435887373804]
We propose a new benchmark dataset - AG-ReID, which performs person re-ID matching in a new setting: across aerial and ground cameras.
Our dataset contains 21,983 images of 388 identities and 15 soft attributes for each identity.
The data was collected by a UAV flying at altitudes between 15 and 45 meters and a ground-based CCTV camera on a university campus.
arXiv Detail & Related papers (2023-03-15T13:07:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.