Going Deeper into Recognizing Actions in Dark Environments: A
Comprehensive Benchmark Study
- URL: http://arxiv.org/abs/2202.09545v3
- Date: Mon, 30 Oct 2023 17:11:19 GMT
- Title: Going Deeper into Recognizing Actions in Dark Environments: A
Comprehensive Benchmark Study
- Authors: Yuecong Xu, Jianfei Yang, Haozhi Cao, Jianxiong Yin, Zhenghua Chen,
Xiaoli Li, Zhengguo Li, Qianwen Xu
- Abstract summary: We focus on the task of action recognition in dark environments, which can be applied to fields such as surveillance and autonomous driving at night.
We launched the UG2+ Challenge Track 2 (UG2-2) at IEEE CVPR 2021 with the goal of evaluating and advancing the robustness of AR models in dark environments.
- Score: 35.53075596912581
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: While action recognition (AR) has gained large improvements with the
introduction of large-scale video datasets and the development of deep neural
networks, AR models robust to challenging environments in real-world scenarios
are still under-explored. We focus on the task of action recognition in dark
environments, which can be applied to fields such as surveillance and
autonomous driving at night. Intuitively, current deep networks along with
visual enhancement techniques should be able to handle AR in dark environments;
however, we observe that this is not always the case in practice. To dive
deeper into exploring solutions for AR in dark environments, we launched the
UG2+ Challenge Track 2 (UG2-2) at IEEE CVPR 2021, with the goal of evaluating and
advancing the robustness of AR models in dark environments. The challenge
builds and expands on top of a novel ARID dataset, the first dataset for the
task of dark video AR, and guides models to tackle such a task in both fully
and semi-supervised manners. Baseline results utilizing current AR models and
enhancement methods are reported, demonstrating the challenging nature of this
task and the substantial room for improvement. Thanks to the active participation
from the research community, notable advances have been made in participants'
solutions, while analysis of these solutions helped better identify possible
directions to tackle the challenge of AR in dark environments.
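To make the "enhancement then recognition" pipeline discussed in the abstract concrete, the following is a minimal, illustrative sketch in Python: it brightens a dark clip with simple gamma correction, one common visual enhancement, and hands the result to a placeholder recognition function. The gamma value, clip shape, and the recognize stub are assumptions for illustration only, not the UG2-2 baseline configuration.

```python
import numpy as np

def gamma_correct(frames: np.ndarray, gamma: float = 0.4) -> np.ndarray:
    """Brighten dark video frames with simple gamma correction.

    frames: uint8 array of shape (T, H, W, C) with values in [0, 255].
    gamma < 1 brightens; 0.4 is an illustrative choice, not a tuned setting.
    """
    x = frames.astype(np.float32) / 255.0
    return (np.power(x, gamma) * 255.0).astype(np.uint8)

def recognize(frames: np.ndarray) -> str:
    """Placeholder for an action recognition backbone (e.g. a 3D CNN).

    A real pipeline would load a pretrained video model here; this stub only
    marks where the enhanced clip would be consumed.
    """
    raise NotImplementedError("plug in an AR model such as I3D or a video transformer")

if __name__ == "__main__":
    # Synthetic "dark" clip: 16 frames of 112x112 RGB with low pixel intensities.
    dark_clip = (np.random.rand(16, 112, 112, 3) * 30).astype(np.uint8)
    enhanced_clip = gamma_correct(dark_clip, gamma=0.4)
    print("mean intensity before/after:", dark_clip.mean(), enhanced_clip.mean())
    # recognize(enhanced_clip)  # would return a predicted action label
```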
Related papers
- CrossFuse: Learning Infrared and Visible Image Fusion by Cross-Sensor Top-K Vision Alignment and Beyond [45.996901339560566]
Infrared and visible image fusion (IVIF) is increasingly applied in critical fields such as video surveillance and autonomous driving systems.
We propose an infrared-visible fusion framework based on Multi-View Augmentation.
Our approach significantly enhances the reliability and stability of IVIF tasks in practical applications.
arXiv Detail & Related papers (2025-02-20T12:19:30Z)
- A Cross-Scene Benchmark for Open-World Drone Active Tracking [54.235808061746525]
Drone Visual Active Tracking aims to autonomously follow a target object by controlling the motion system based on visual observations.
We propose a unified cross-scene cross-domain benchmark for open-world drone active tracking called DAT.
We also propose a reinforcement learning-based drone tracking method called R-VAT.
arXiv Detail & Related papers (2024-12-01T09:37:46Z)
- Foundation Models for Remote Sensing and Earth Observation: A Survey [101.77425018347557]
This survey systematically reviews the emerging field of Remote Sensing Foundation Models (RSFMs).
It begins with an outline of their motivation and background, followed by an introduction to their foundational concepts.
We benchmark these models against publicly available datasets, discuss existing challenges, and propose future research directions.
arXiv Detail & Related papers (2024-10-22T01:08:21Z)
- QueensCAMP: an RGB-D dataset for robust Visual SLAM [0.0]
We introduce a novel RGB-D dataset designed for evaluating the robustness of VSLAM systems.
The dataset comprises real-world indoor scenes with dynamic objects, motion blur, and varying illumination.
We offer open-source scripts for injecting camera failures into any images, enabling further customization.
arXiv Detail & Related papers (2024-10-16T12:58:08Z)
- Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z)
- Outdoor Environment Reconstruction with Deep Learning on Radio Propagation Paths [5.030571576007511]
This paper proposes a novel approach harnessing ambient wireless signals for outdoor environment reconstruction.
By analyzing radio frequency (RF) data, the paper aims to deduce the environmental characteristics and digitally reconstruct the outdoor surroundings.
Two DL-driven approaches are evaluated, with performance assessed using metrics like intersection-over-union (IoU), Hausdorff distance, and Chamfer distance (see the minimal metric sketch after this list).
arXiv Detail & Related papers (2024-02-27T09:11:10Z)
- Mobile AR Depth Estimation: Challenges & Prospects -- Extended Version [12.887748044339913]
We investigate the challenges and opportunities of achieving accurate metric depth estimation in mobile AR.
We tested four different state-of-the-art monocular depth estimation models on a newly introduced dataset (ARKitScenes).
Our research provides promising future directions to explore and solve those challenges.
arXiv Detail & Related papers (2023-10-22T22:47:51Z)
- Egocentric RGB+Depth Action Recognition in Industry-Like Settings [50.38638300332429]
Our work focuses on recognizing actions from egocentric RGB and Depth modalities in an industry-like environment.
Our framework is based on the 3D Video SWIN Transformer to encode both RGB and Depth modalities effectively.
Our method also secured first place at the multimodal action recognition challenge at ICIAP 2023.
arXiv Detail & Related papers (2023-09-25T08:56:22Z)
- Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios.
A new research focus is computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing the computational cost during inference.
arXiv Detail & Related papers (2023-08-27T03:55:28Z)
- UDTIRI: An Online Open-Source Intelligent Road Inspection Benchmark Suite [21.565438268381467]
We introduce the road pothole detection task, the first online competition published within this benchmark suite.
Our benchmark provides a systematic and thorough evaluation of state-of-the-art object detection, semantic segmentation, and instance segmentation networks.
By providing algorithms with a more comprehensive understanding of diverse road conditions, we seek to unlock their untapped potential.
arXiv Detail & Related papers (2023-04-18T09:13:52Z)
- Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection using Meta-Learning [64.92447072894055]
Infrared (IR) cameras are robust under adverse illumination and lighting conditions.
We propose an algorithm meta-learning framework to improve existing UDA methods.
We produce a state-of-the-art thermal detector for the KAIST and DSIAC datasets.
arXiv Detail & Related papers (2021-10-07T02:28:18Z)
- A Vision Based Deep Reinforcement Learning Algorithm for UAV Obstacle Avoidance [1.2693545159861856]
We present two techniques for improving exploration for UAV obstacle avoidance.
The first is a convergence-based approach that uses convergence error to iterate through unexplored actions and a temporal threshold to balance exploration and exploitation.
The second is a guidance-based approach which uses a Gaussian mixture distribution to compare previously seen states to a predicted next state in order to select the next action.
arXiv Detail & Related papers (2021-03-11T01:15:26Z)
- Counterfactual Vision-and-Language Navigation via Adversarial Path Sampling [65.99956848461915]
Vision-and-Language Navigation (VLN) is a task where agents must decide how to move through a 3D environment to reach a goal.
One of the problems of the VLN task is data scarcity since it is difficult to collect enough navigation paths with human-annotated instructions for interactive environments.
We propose an adversarial-driven counterfactual reasoning model that can consider effective conditions instead of low-quality augmented data.
arXiv Detail & Related papers (2019-11-17T18:02:51Z)
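For reference, the reconstruction metrics named in the "Outdoor Environment Reconstruction with Deep Learning on Radio Propagation Paths" entry above are standard geometric measures. Below is a minimal numpy sketch of two of them, IoU over occupancy masks and symmetric Chamfer distance over point sets; the shapes and toy values are illustrative assumptions, not data from that paper.

```python
import numpy as np

def iou(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Intersection-over-union of two boolean occupancy masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(inter) / float(union) if union else 1.0

def chamfer_distance(points_a: np.ndarray, points_b: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets of shape (N, D) and (M, D)."""
    # Pairwise Euclidean distances, shape (N, M).
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=-1)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

if __name__ == "__main__":
    a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    b = np.array([[0.1, 0.0], [1.0, 0.1], [0.0, 0.9]])
    print("Chamfer distance:", chamfer_distance(a, b))

    m1 = np.zeros((8, 8), dtype=bool); m1[2:6, 2:6] = True
    m2 = np.zeros((8, 8), dtype=bool); m2[3:7, 3:7] = True
    print("IoU:", iou(m1, m2))
```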