HomeEmergency -- Using Audio to Find and Respond to Emergencies in the Home
- URL: http://arxiv.org/abs/2504.01089v1
- Date: Tue, 01 Apr 2025 18:07:25 GMT
- Title: HomeEmergency -- Using Audio to Find and Respond to Emergencies in the Home
- Authors: James F. Mullen Jr, Dhruva Kumar, Xuewei Qi, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha, Richard Kim
- Abstract summary: In the United States alone, accidental home deaths exceed 128,000 per year. Our work aims to enable home robots that respond to emergency scenarios in the home, preventing injuries and deaths.
- Score: 42.18870689560617
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In the United States alone, accidental home deaths exceed 128,000 per year. Our work aims to enable home robots that respond to emergency scenarios in the home, preventing injuries and deaths. We introduce a new dataset of household emergencies built in the ThreeDWorld simulator. Each scenario in our dataset begins with an instantaneous or periodic sound, which may or may not indicate an emergency. The agent must navigate the multi-room home scene using prior observations, alongside audio signals and images from the simulator, to determine whether there is an emergency. In addition to our new dataset, we present a modular approach for localizing and identifying potential home emergencies. Underpinning our approach is a novel probabilistic dynamic scene graph (P-DSG), where our key insight is that graph nodes corresponding to agents can be represented with a probabilistic edge. This edge, when refined using Bayesian inference, enables efficient and effective localization of agents in the scene. We also utilize multi-modal vision-language models (VLMs) as a component in our approach, determining object traits (e.g., flammability) and identifying emergencies. We present a demonstration of our method completing a real-world version of our task on a consumer robot, showing the transferability of both our task and our method. Our dataset will be released to the public upon this paper's publication.
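The abstract's idea of refining a probabilistic edge with Bayesian inference can be illustrated with a minimal sketch. This is not the authors' P-DSG implementation: the room names, the prior, the audio likelihood values, and the `bayes_update` helper are all hypothetical, chosen only to show how a belief over an agent's location might be updated after one observation.

```python
from typing import Dict


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Return P(room | observation) given P(room) and P(observation | room)."""
    # Multiply prior by likelihood for each candidate room (rooms missing
    # from the likelihood table are treated as having likelihood 0).
    unnormalized = {room: p * likelihood.get(room, 0.0) for room, p in prior.items()}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is impossible under every hypothesis;
        # fall back to the prior rather than dividing by zero.
        return dict(prior)
    return {room: p / total for room, p in unnormalized.items()}


# Hypothetical prior belief about which room the person is in
# (for example, derived from earlier observations).
prior = {"kitchen": 0.5, "bedroom": 0.3, "bathroom": 0.2}

# Hypothetical likelihood of hearing the detected sound from each room.
audio_likelihood = {"kitchen": 0.1, "bedroom": 0.2, "bathroom": 0.7}

posterior = bayes_update(prior, audio_likelihood)
most_likely_room = max(posterior, key=posterior.get)
print(posterior, most_likely_room)
```

In this toy example the sound evidence shifts the belief toward the bathroom even though the prior favored the kitchen, which is the kind of refinement a probabilistic edge would allow; how the paper actually parameterizes and updates the edge is not specified in the abstract.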
Related papers
- Predictive Probability Density Mapping for Search and Rescue Using An Agent-Based Approach with Sparse Data [0.294944680995069]
We introduce an agent-based model designed to replicate diverse psychological profiles of lost persons. The model allows these agents to navigate real-world landscapes while making decisions autonomously. This work introduces a flexible agent that can be employed in search and rescue operations, offering adaptability across various geographical locations.
arXiv Detail & Related papers (2024-12-17T20:37:26Z) - Hazards in Daily Life? Enabling Robots to Proactively Detect and Resolve Anomalies [26.79399508110069]
We argue that household robots should proactively detect such hazards or anomalies within the home.
We leverage foundational models instead of relying on manually labeled data to build simulated environments.
We demonstrate that our generated environment outperforms others in terms of task description and scene diversity.
arXiv Detail & Related papers (2024-10-16T19:29:14Z) - Simultaneous Localization and Affordance Prediction for Tasks in Egocentric Video [18.14234312389889]
We present a system which trains on spatially-localized egocentric videos in order to connect visual input and task descriptions.
We show our approach outperforms the baseline of using a VLM to map similarity of a task's description over a set of location-tagged images.
The resulting system enables robots to use egocentric sensing to navigate to physical locations of novel tasks specified in natural language.
arXiv Detail & Related papers (2024-07-18T18:55:56Z) - "Don't forget to put the milk back!" Dataset for Enabling Embodied Agents to Detect Anomalous Situations [49.66220439673356]
We have created a new dataset, which we call SafetyDetect.
The SafetyDetect dataset consists of 1000 anomalous home scenes.
Our approach utilizes large language models (LLMs) alongside both a graph representation of the scene and the relationships between the objects in the scene.
arXiv Detail & Related papers (2024-04-12T21:56:21Z) - JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios.
This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective.
The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
arXiv Detail & Related papers (2023-11-05T18:59:31Z) - Active Visual Localization for Multi-Agent Collaboration: A Data-Driven Approach [47.373245682678515]
This work investigates how active visual localization can be used to overcome challenges of viewpoint changes.
Specifically, we focus on the problem of selecting the optimal viewpoint at a given location.
The result demonstrates the superior performance of the data-driven approach when compared to existing methods.
arXiv Detail & Related papers (2023-10-04T08:18:30Z) - Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models. Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning. Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z) - Dwelling Type Classification for Disaster Risk Assessment Using Satellite Imagery [3.88838725116957]
Vulnerability and risk assessment of neighborhoods is essential for effective disaster preparedness.
Existing traditional systems, due to dependency on time-consuming and cost-intensive field surveying, do not provide a scalable way to decipher warnings and assess the precise extent of the risk at a hyper-local level.
In this work, machine learning was used to automate the process of identifying dwellings and their type to build a potentially more effective disaster vulnerability assessment system.
arXiv Detail & Related papers (2022-11-16T03:08:15Z) - H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
arXiv Detail & Related papers (2022-10-22T18:39:33Z) - ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z)