Potential Field as Scene Affordance for Behavior Change-Based Visual Risk Object Identification
- URL: http://arxiv.org/abs/2409.15846v1
- Date: Tue, 24 Sep 2024 08:17:50 GMT
- Title: Potential Field as Scene Affordance for Behavior Change-Based Visual Risk Object Identification
- Authors: Pang-Yuan Pao, Shu-Wei Lu, Ze-Yan Lu, Yi-Ting Chen
- Abstract summary: We study behavior change-based visual risk object identification (Visual-ROI).
Existing methods often show significant limitations in spatial accuracy and temporal consistency.
We propose a new framework with a Bird's Eye View representation to overcome these challenges.
- Score: 4.896236083290351
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study behavior change-based visual risk object identification (Visual-ROI), a critical framework designed to detect potential hazards for intelligent driving systems. Existing methods often show significant limitations in spatial accuracy and temporal consistency, stemming from an incomplete understanding of scene affordance. For example, these methods frequently misidentify vehicles that do not impact the ego vehicle as risk objects. Furthermore, existing behavior change-based methods are inefficient because they implement causal inference in the perspective image space. We propose a new framework with a Bird's Eye View (BEV) representation to overcome the above challenges. Specifically, we utilize potential fields as scene affordance, involving repulsive forces derived from road infrastructure and traffic participants, along with attractive forces sourced from target destinations. In this work, we compute potential fields by assigning different energy levels according to the semantic labels obtained from BEV semantic segmentation. We conduct thorough experiments and ablation studies, comparing the proposed method with various state-of-the-art algorithms on both synthetic and real-world datasets. Our results show a notable increase in spatial and temporal consistency, with enhancements of 20.3% and 11.6% on the RiskBench dataset, respectively. Additionally, we can improve computational efficiency by 88%. We achieve improvements of 5.4% in spatial accuracy and 7.2% in temporal consistency on the nuScenes dataset.
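As a rough illustration of the abstract's central idea, the sketch below composes a BEV potential field from repulsive terms (road infrastructure, traffic participants) and an attractive term (the target destination), with energy levels assigned per semantic class. The class ids, energy values, Gaussian falloff, and linear attraction are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Hypothetical class ids in the BEV segmentation and their energy levels
# (the paper assigns energies from BEV semantic labels; these values are
# placeholders for illustration).
CLASS_ENERGY = {1: 1.0,   # vehicle
                2: 1.0,   # pedestrian
                3: 0.5}   # road boundary

def potential_field(bev_seg, goal_rc, sigma=3.0, k_att=0.02):
    """Compose a BEV potential field from repulsive and attractive terms.

    bev_seg : (H, W) int array of semantic class ids.
    goal_rc : (row, col) of the target destination in grid coordinates.
    """
    h, w = bev_seg.shape
    field = np.zeros((h, w), dtype=np.float32)

    # Repulsive term: each obstacle class contributes a Gaussian falloff
    # of its assigned energy around the cells it occupies.
    for cls, energy in CLASS_ENERGY.items():
        mask = bev_seg == cls
        if mask.any():
            # Distance (in cells) from every free cell to the nearest obstacle.
            dist = distance_transform_edt(~mask)
            field += energy * np.exp(-(dist ** 2) / (2 * sigma ** 2))

    # Attractive term: energy grows with distance to the goal, so descending
    # the field pulls the ego vehicle toward its destination.
    rows, cols = np.mgrid[0:h, 0:w]
    field += k_att * np.hypot(rows - goal_rc[0], cols - goal_rc[1])
    return field
```

Under this view, a traffic participant whose repulsive field never overlaps the ego vehicle's low-energy corridor toward the goal is unlikely to be flagged as a risk object, which is the scene-affordance intuition the abstract describes.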
Related papers
- Perceptual Piercing: Human Visual Cue-based Object Detection in Low Visibility Conditions [2.0409124291940826]
This study proposes a novel deep learning framework inspired by atmospheric scattering and human visual cortex mechanisms to enhance object detection under poor visibility scenarios such as fog, smoke, and haze.
The objective is to enhance the precision and reliability of detection systems under adverse environmental conditions.
arXiv Detail & Related papers (2024-10-02T04:03:07Z)
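For background on the entry above: the atmospheric scattering model that inspires dehazing-style pipelines is I(x) = J(x)t(x) + A(1 - t(x)). A minimal sketch of inverting it follows, assuming the transmission map and airlight have already been estimated (e.g., by a learned or dark-channel estimator); this is standard dehazing background, not the paper's specific framework.

```python
import numpy as np

def dehaze(image, transmission, airlight, t_min=0.1):
    """Invert the atmospheric scattering model I = J*t + A*(1 - t).

    image        : (H, W, 3) hazy input in [0, 1].
    transmission : (H, W) estimated t(x) = exp(-beta * d(x)).
    airlight     : (3,) estimated global atmospheric light A.
    """
    t = np.clip(transmission, t_min, 1.0)[..., None]  # avoid division blow-up
    # Solve for the scene radiance J(x) = (I(x) - A) / t(x) + A.
    return np.clip((image - airlight) / t + airlight, 0.0, 1.0)
```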
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction for the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
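A minimal sketch of what such an implicit representation can look like: one network maps a continuous spatio-temporal query (x, y, t), together with a scene feature sampled at that location, to an occupancy probability and a 2D flow vector. The layer sizes and feature interface are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class OccupancyFlowField(nn.Module):
    """Implicit field: (x, y, t) query + scene feature -> (occupancy, flow)."""

    def __init__(self, feat_dim=128, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # 1 occupancy logit + 2 flow components
        )

    def forward(self, query_xyt, scene_feat):
        # query_xyt: (N, 3) continuous query points; scene_feat: (N, feat_dim)
        # backbone features sampled at each query location.
        out = self.mlp(torch.cat([query_xyt, scene_feat], dim=-1))
        occupancy = torch.sigmoid(out[:, :1])  # P(occupied) at (x, y, t)
        flow = out[:, 1:]                      # 2D motion at the query point
        return occupancy, flow
```

Because the field is queried pointwise, a planner can evaluate only the locations it cares about instead of decoding a dense grid for the whole scene.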
- Adaptive Face Recognition Using Adversarial Information Network [57.29464116557734]
Face recognition models often degrade when the training data differ from the testing data.
We propose a novel adversarial information network (AIN) to address this domain gap.
arXiv Detail & Related papers (2023-05-23T02:14:11Z)
- Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding [51.8579160500354]
We propose an unsupervised way to predict self-driving attention by uncertainty modeling and driving knowledge integration.
Results show performance on par with, or better than, fully-supervised state-of-the-art approaches.
arXiv Detail & Related papers (2023-03-17T00:28:33Z)
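The summary above mentions uncertainty modeling; one generic way to mine predictive uncertainty for an attention map is Monte-Carlo dropout, sketched below. This is a common stand-in illustration, not the paper's specific uncertainty-mining scheme.

```python
import torch

def mc_dropout_uncertainty(model, image, n_samples=20):
    """Estimate per-pixel mean and variance of a predicted attention map
    by keeping dropout active at inference (Monte-Carlo dropout)."""
    model.train()  # keep dropout layers stochastic
    with torch.no_grad():
        samples = torch.stack([model(image) for _ in range(n_samples)])
    return samples.mean(dim=0), samples.var(dim=0)  # prediction, uncertainty
```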
- Symbolic Perception Risk in Autonomous Driving [4.371383574272895]
We develop a novel framework to assess the risk of misperception in a traffic sign classification task.
We consider the problem in an autonomous driving setting, where visual input quality gradually improves.
We derive a closed-form representation of the conditional value-at-risk (CVaR) of misperception.
arXiv Detail & Related papers (2023-03-16T15:49:24Z)
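As a reference point for the result above: CVaR at level alpha is the expected loss in the worst (1 - alpha) tail of the loss distribution. The paper derives a closed form for its misperception model; the generic empirical estimator is only a few lines.

```python
import numpy as np

def empirical_cvar(losses, alpha=0.95):
    """CVaR_alpha: mean of the losses at or above the alpha-quantile (VaR)."""
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, alpha)   # Value-at-Risk threshold
    tail = losses[losses >= var]       # worst (1 - alpha) outcomes
    return tail.mean()
```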
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
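Cluster-level pseudo-labelling can be pictured as: cluster the target-domain features, then assign one label per cluster for self-training, which smooths noisy per-sample predictions. The k-means choice and the averaged-prediction assignment below are illustrative assumptions, not necessarily the paper's rule.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_pseudo_labels(features, classifier_probs, n_classes):
    """Assign one pseudo-label per cluster instead of per sample.

    features         : (N, D) self-supervised target-domain embeddings.
    classifier_probs : (N, C) current source-model class probabilities.
    """
    clusters = KMeans(n_clusters=n_classes, n_init=10).fit_predict(features)
    labels = np.empty(len(features), dtype=int)
    for c in range(n_classes):
        member = clusters == c
        # Label the whole cluster with its members' average prediction,
        # which smooths out noisy per-sample pseudo-labels.
        labels[member] = classifier_probs[member].mean(axis=0).argmax()
    return labels
```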
- Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification to preserve small objects of interest during the adaptation process.
The quality of the generated BEVs is evaluated using a state-of-the-art 3D object detection framework on the KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z)
- Collision-Aware Target-Driven Object Grasping in Constrained Environments [10.934615956723672]
We propose a novel Collision-Aware Reachability Predictor (CARP) for 6-DoF grasping systems.
The CARP learns to estimate the collision-free probabilities for grasp poses and significantly improves grasping in challenging environments.
The experiments in both simulation and the real world show that our approach achieves a grasping success rate of more than 75% on novel objects.
arXiv Detail & Related papers (2021-04-01T21:44:07Z)
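The reachability predictor above can be pictured as a binary classifier over grasp poses. Below is a minimal sketch with a hypothetical pose encoding (3D position plus unit quaternion) and an opaque scene feature; the sizes and interfaces are assumptions, not CARP's actual design.

```python
import torch
import torch.nn as nn

class ReachabilityPredictor(nn.Module):
    """Map a 6-DoF grasp pose (position + quaternion) plus a scene encoding
    to a collision-free probability, in the spirit of CARP."""

    def __init__(self, scene_dim=64, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(7 + scene_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, pose, scene_feat):
        # pose: (N, 7) = xyz + unit quaternion; scene_feat: (N, scene_dim).
        logit = self.net(torch.cat([pose, scene_feat], dim=-1))
        return torch.sigmoid(logit)  # P(grasp pose is collision-free)
```

In use, candidate grasps would be ranked by predicted collision-free probability and the best-scoring pose executed.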
- BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments [54.22535063244038]
We present an unsupervised adaptation approach for visual scene understanding in unstructured traffic environments.
Our method is designed for unstructured real-world scenarios with dense and heterogeneous traffic consisting of cars, trucks, two- and three-wheelers, and pedestrians.
arXiv Detail & Related papers (2020-09-22T08:25:44Z)
- Scene-Graph Augmented Data-Driven Risk Assessment of Autonomous Vehicle Decisions [1.4086978333609153]
We propose a novel data-driven approach that uses scene-graphs as intermediate representations.
Our approach includes a Multi-Relation Graph Convolution Network, a Long-Short Term Memory Network, and attention layers for modeling the subjective risk of driving maneuvers.
We show that our approach achieves higher classification accuracy than the state-of-the-art approach on both large (96.4% vs. 91.2%) and small (91.8% vs. 71.2%) datasets.
We also show that our model trained on a synthesized dataset achieves an average accuracy of 87.8% when tested on a real-world dataset.
arXiv Detail & Related papers (2020-08-31T07:41:27Z)
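The pipeline described above (multi-relation graph convolution over scene-graphs, an LSTM over time, attention) can be sketched compactly. The relation handling below is simplified to one dense adjacency per relation type, and all dimensions are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class SceneGraphRisk(nn.Module):
    """Multi-relation GCN -> attention pooling -> LSTM -> risk score."""

    def __init__(self, in_dim=32, hid=64, n_relations=3):
        super().__init__()
        self.rel_weights = nn.ModuleList(
            [nn.Linear(in_dim, hid, bias=False) for _ in range(n_relations)])
        self.attn = nn.Linear(hid, 1)  # node-attention scores
        self.lstm = nn.LSTM(hid, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)  # subjective-risk logit

    def forward(self, node_feats, adjs):
        # node_feats: (T, N, in_dim) per-frame node features;
        # adjs: (T, R, N, N) one adjacency matrix per relation type.
        frame_embs = []
        for x, a in zip(node_feats, adjs):
            # Multi-relation graph convolution: sum of per-relation messages.
            h = torch.relu(sum(rel(a_r @ x)
                               for rel, a_r in zip(self.rel_weights, a)))
            # Attention pooling: weight nodes by learned relevance.
            w = torch.softmax(self.attn(h), dim=0)
            frame_embs.append((w * h).sum(dim=0))
        seq = torch.stack(frame_embs).unsqueeze(0)  # (1, T, hid)
        _, (h_n, _) = self.lstm(seq)
        return torch.sigmoid(self.head(h_n[-1]))    # P(risky maneuver)
```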