Related papers: Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception

URL: http://arxiv.org/abs/2311.13793v1
Date: Thu, 23 Nov 2023 03:51:46 GMT
Title: Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
Authors: Lei Fan, Mingfu Liang, Yunxuan Li, Gang Hua and Ying Wu
Abstract summary: Active recognition enables robots to explore novel observations, thereby acquiring more information while circumventing undesired viewing conditions. Most recognition modules are developed under the closed-world assumption, which makes them ill-equipped to handle unexpected inputs, such as the absence of the target object in the current observation. We propose treating active recognition as a sequential evidence-gathering process, providing step-by-step uncertainty and reliable prediction under the evidence combination theory.
Score: 21.639429724987902
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Active recognition enables robots to intelligently explore novel observations, thereby acquiring more information while circumventing undesired viewing conditions. Recent approaches favor learning policies from simulated or collected data, wherein appropriate actions are more frequently selected when the recognition is accurate. However, most recognition modules are developed under the closed-world assumption, which makes them ill-equipped to handle unexpected inputs, such as the absence of the target object in the current observation. To address this issue, we propose treating active recognition as a sequential evidence-gathering process, providing step-by-step uncertainty quantification and reliable prediction under the evidence combination theory. Additionally, the reward function developed in this paper effectively characterizes the merit of actions when operating in open-world environments. To evaluate the performance, we collect a dataset from an indoor simulator, encompassing various recognition challenges such as distance, occlusion levels, and visibility. Through a series of experiments on recognition and robustness analysis, we demonstrate the necessity of introducing uncertainties to active recognition and the superior performance of the proposed method.

Related papers

Stochastic Encodings for Active Feature Acquisition [100.47043816019888]
Active Feature Acquisition is an instance-wise, sequential decision-making problem. The aim is to dynamically select which feature to measure based on current observations, independently for each test instance. Common approaches either use Reinforcement Learning, which experiences training difficulties, or greedily maximize the conditional mutual information of the label and unobserved features, which leads to myopic acquisitions. We introduce a latent variable model, trained in a supervised manner. Acquisitions are made by reasoning about the features across many possible unobserved realizations in a latent space.
arXiv Detail & Related papers (2025-08-03T23:48:46Z)
Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation [52.83870601473094]
Embodied agents exhibit immense potential across a multitude of domains. Existing research predominantly concentrates on the security of general large language models. This paper introduces a novel input moderation framework, meticulously designed to safeguard embodied agents.
arXiv Detail & Related papers (2025-04-22T08:34:35Z)
Rethinking Top Probability from Multi-view for Distracted Driver Behaviour Localization [6.531367337657802]
The action localization task aims to recognize and comprehend human behaviors and actions from video data captured during real-world driving scenarios. Previous studies have shown great action localization performance by applying a recognition model followed by probability-based post-processing. In this work, we adopt an action recognition model based on self-supervised learning to detect distracted activities and give potential action probabilities.
arXiv Detail & Related papers (2024-11-19T14:18:02Z)
Managing the unknown: a survey on Open Set Recognition and tangential areas [7.345136916791223]
Open Set Recognition models are capable of detecting unknown classes from samples arriving during the testing phase, while maintaining a good level of performance in the classification of samples belonging to known classes. This review comprehensively surveys the recent literature related to Open Set Recognition, identifying common practices, limitations, and connections of this field with other machine learning research areas. Our work also uncovers open problems and suggests several research directions that may motivate and articulate future efforts towards safer Artificial Intelligence methods.
arXiv Detail & Related papers (2023-12-14T10:08:12Z)
Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations [9.444540281544715]
We introduce a novel agent for active open-vocabulary recognition. The proposed method leverages inter-frame and inter-concept similarities to navigate agent movements and to fuse features, without relying on class-specific knowledge.
arXiv Detail & Related papers (2023-11-28T19:24:07Z)
Unsupervised Self-Driving Attention Prediction via Uncertainty Mining and Knowledge Embedding [51.8579160500354]
We propose an unsupervised way to predict self-driving attention by uncertainty modeling and driving knowledge integration. Results show equivalent or even more impressive performance compared to fully-supervised state-of-the-art approaches.
arXiv Detail & Related papers (2023-03-17T00:28:33Z)
Uncertainty-Aware Lidar Place Recognition in Novel Environments [11.30020653282995]
We investigate the task of uncertainty-aware lidar place recognition. Each predicted place must have an associated uncertainty that can be used to identify and reject incorrect predictions. We introduce a novel evaluation protocol and present the first comprehensive benchmark for this task.
arXiv Detail & Related papers (2022-10-04T04:06:44Z)
A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA [67.75989848202343]
This paper presents a unified end-to-end retriever-reader framework for knowledge-based VQA. We shed light on the multi-modal implicit knowledge from vision-language pre-training models to mine its potential in knowledge reasoning. Our scheme is able not only to provide guidance for knowledge retrieval, but also to drop instances that are potentially error-prone for question answering.
arXiv Detail & Related papers (2022-06-30T02:35:04Z)
The Familiarity Hypothesis: Explaining the Behavior of Deep Open Set Methods [86.39044549664189]
Anomaly detection algorithms for feature-vector data identify anomalies as outliers, but outlier detection has not worked well in deep learning. This paper proposes the Familiarity Hypothesis that these methods succeed because they are detecting the absence of familiar learned features rather than the presence of novelty. The paper concludes with a discussion of whether familiarity detection is an inevitable consequence of representation learning.
arXiv Detail & Related papers (2022-03-04T18:32:58Z)
Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data. Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z)
Towards Unbiased Visual Emotion Recognition via Causal Intervention [63.74095927462]
We propose a novel Interventional Emotion Recognition Network (IERN) to alleviate the negative effects brought by the dataset bias. A series of designed tests validate the effectiveness of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms other state-of-the-art approaches.
arXiv Detail & Related papers (2021-07-26T10:40:59Z)
Uncertainty-Aware Vehicle Orientation Estimation for Joint Detection-Prediction Models [12.56249869551208]
Orientation is an important property for downstream modules of an autonomous system. We present a method that extends the existing models that perform joint object detection and motion prediction. In addition, the approach is able to quantify prediction uncertainty, outputting the probability that the inferred orientation is flipped.
arXiv Detail & Related papers (2020-11-05T21:59:44Z)
Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery [85.36948722680822]
We develop a context-aware mixture of deep models termed the alpha-beta network. We improve accuracy and F score by 10% by identifying high-level contexts. In order to ensure training stability, we have used a clustering-based pre-training in both public and in-house datasets.
arXiv Detail & Related papers (2020-03-03T19:35:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.