Multidimensional Human Activity Recognition With Large Language Model: A Conceptual Framework
- URL: http://arxiv.org/abs/2410.03546v1
- Date: Mon, 16 Sep 2024 21:36:23 GMT
- Title: Multidimensional Human Activity Recognition With Large Language Model: A Conceptual Framework
- Authors: Syed Mhamudul Hasan,
- Abstract summary: In high-stakes environments like emergency response or elder care, the integration of large language models (LLMs) revolutionizes risk assessment, resource allocation, and emergency response.
We propose a conceptual framework that utilizes various wearable devices, each considered as a single dimension, to support a multidimensional learning approach within Human Activity Recognition (HAR) systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In high-stakes environments like emergency response or elder care, the integration of large language models (LLMs) revolutionizes risk assessment, resource allocation, and emergency response in Human Activity Recognition (HAR) systems by leveraging data from various wearable sensors. We propose a conceptual framework that utilizes various wearable devices, each considered as a single dimension, to support a multidimensional learning approach within HAR systems. By integrating and processing data from these diverse sources, LLMs can translate complex sensor inputs into actionable insights. This integration mitigates the inherent uncertainties and complexities of heterogeneous sensor data, thereby enhancing the responsiveness and effectiveness of emergency services. This paper sets the stage for exploring the transformative potential of LLMs within HAR systems to empower emergency workers as they navigate the unpredictable and risky environments of their critical roles.
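To make the proposed pipeline concrete, below is a minimal sketch of how such a multidimensional framework might serialize readings from several wearables (each device treated as one dimension) into a single LLM prompt. The device names, prompt format, and the `query_llm` placeholder are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: serializing multi-wearable sensor readings into an
# LLM prompt for activity recognition. Device names, the prompt format, and
# query_llm are illustrative assumptions, not the paper's implementation.
from dataclasses import dataclass

@dataclass
class SensorReading:
    device: str       # one wearable = one "dimension" in the framework
    modality: str     # e.g., accelerometer, heart rate
    values: list[float]

def build_prompt(readings: list[SensorReading]) -> str:
    """Flatten each dimension into a text line the LLM can reason over."""
    lines = [
        "You are monitoring a person in an emergency-response setting.",
        "Given the sensor readings below, infer the current activity",
        "and flag any risk that needs escalation.",
        "",
    ]
    for r in readings:
        lines.append(f"[{r.device} | {r.modality}] {r.values}")
    return "\n".join(lines)

readings = [
    SensorReading("wrist_imu", "accelerometer_g", [0.1, 0.9, 4.2]),
    SensorReading("chest_strap", "heart_rate_bpm", [148.0]),
    SensorReading("smart_shoe", "step_cadence_spm", [0.0]),
]
prompt = build_prompt(readings)
# response = query_llm(prompt)  # query_llm is a placeholder for any LLM API
print(prompt)
```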
Related papers
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, showing their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- Coherence-Driven Multimodal Safety Dialogue with Active Learning for Embodied Agents [23.960719833886984]
M-CoDAL is a multimodal dialogue system specifically designed for embodied agents to better understand and communicate in safety-critical situations.
Our approach is evaluated using a newly created multimodal dataset comprising 1K safety violations extracted from 2K Reddit images.
Results on this dataset demonstrate that our approach improves the resolution of safety situations, user sentiment, and the overall safety of the conversation.
arXiv Detail & Related papers (2024-10-18T03:26:06Z)
- Selective Exploration and Information Gathering in Search and Rescue Using Hierarchical Learning Guided by Natural Language Input [5.522800137785975]
We introduce a system that integrates social interaction via large language models (LLMs) with a hierarchical reinforcement learning (HRL) framework.
The proposed system is designed to translate verbal inputs from human stakeholders into actionable RL insights and adjust its search strategy.
By leveraging human-provided information through LLMs and structuring task execution through HRL, our approach significantly improves the agent's learning efficiency and decision-making process in environments characterised by long horizons and sparse rewards.
arXiv Detail & Related papers (2024-09-20T12:27:47Z)
- Cooperative Resilience in Artificial Intelligence Multiagent Systems [2.0608564715600273]
This paper proposes a clear definition of 'cooperative resilience' and a methodology for its quantitative measurement.
The results highlight the crucial role of resilience metrics in analyzing how the collective system prepares for, resists, recovers from, sustains well-being, and transforms in the face of disruptions.
arXiv Detail & Related papers (2024-09-20T03:28:48Z)
- A Study on Prompt Injection Attack Against LLM-Integrated Mobile Robotic Systems [4.71242457111104]
Large Language Models (LLMs) can process multi-modal prompts, enabling them to generate more context-aware responses.
One of the primary concerns is the potential security risks associated with using LLMs in robotic navigation tasks.
This study investigates the impact of prompt injections on mobile robot performance in LLM-integrated systems.
arXiv Detail & Related papers (2024-08-07T02:48:22Z)
- A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks [74.52259252807191]
Multimodal Large Language Models (MLLMs) address the complexities of real-world applications far beyond the capabilities of single-modality systems.
This paper systematically surveys the applications of MLLMs in multimodal tasks such as natural language, vision, and audio.
arXiv Detail & Related papers (2024-08-02T15:14:53Z)
- Towards LLM-Powered Ambient Sensor Based Multi-Person Human Activity Recognition [4.187145402358247]
Human Activity Recognition (HAR) is one of the central problems in fields such as healthcare, elderly care, and security at home.
This paper proposes a system framework called LAHAR, based on large language models.
arXiv Detail & Related papers (2024-06-25T07:41:34Z)
- AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z)
- Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z)
- A Low-rank Matching Attention based Cross-modal Feature Fusion Method for Conversational Emotion Recognition [54.44337276044968]
We introduce a novel and lightweight cross-modal feature fusion method called the Low-Rank Matching Attention Method (LMAM).
LMAM effectively captures contextual emotional semantic information in conversations while mitigating the quadratic complexity issue caused by the self-attention mechanism (a generic low-rank sketch appears after this list).
Experimental results verify that LMAM outperforms other popular cross-modal fusion methods while remaining more lightweight.
arXiv Detail & Related papers (2023-06-16T16:02:44Z)
- Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information [77.19830787312743]
In real-world reinforcement learning applications, the learner's observation space is typically high-dimensional, containing both relevant and irrelevant information about the task at hand.
We introduce a new problem setting for reinforcement learning, the Exogenous Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable component and a large irrelevant component.
We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity that scales with the size of the endogenous component rather than the full state space.
arXiv Detail & Related papers (2022-06-09T05:19:32Z)
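The ExoMDP entry above describes a factored problem setting; the following toy environment sketches that factorization under stated assumptions. It shows only the structure of the setting (actions move a small endogenous component while a large exogenous component drifts on its own); it is not the paper's ExoRL algorithm, and all names are hypothetical.

```python
# Hypothetical sketch of the ExoMDP structure: the observation factors into a
# small controllable (endogenous) part and a large task-irrelevant (exogenous)
# part. Illustrative only; this is not the paper's ExoRL algorithm.
import random

class ToyExoMDP:
    def __init__(self, num_endo_states: int = 5, num_exo_states: int = 1000):
        self.endo = 0                      # small component the agent controls
        self.exo = 0                       # large component evolving on its own
        self.num_endo = num_endo_states
        self.num_exo = num_exo_states

    def step(self, action: int):
        # Only the endogenous component responds to the action...
        self.endo = (self.endo + action) % self.num_endo
        # ...while the exogenous component changes independently of it.
        self.exo = random.randrange(self.num_exo)
        reward = 1.0 if self.endo == self.num_endo - 1 else 0.0
        # The learner sees the concatenation and must discover the factorization,
        # so sample complexity should depend on num_endo, not num_endo * num_exo.
        return (self.endo, self.exo), reward

env = ToyExoMDP()
obs, r = env.step(action=1)
```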
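As flagged in the LMAM entry above, here is a generic sketch of the low-rank idea behind such fusion methods: factoring a full projection matrix into two rank-r factors to cut parameters and compute. This is not the paper's exact LMAM formulation; the dimensions, rank, and modality names are assumptions.

```python
# Generic low-rank cross-modal attention sketch (NOT the exact LMAM method):
# each dim x dim projection is factored into two rank-r linear maps, r << dim.
import torch
import torch.nn as nn

class LowRankCrossModalAttention(nn.Module):
    def __init__(self, dim: int = 256, rank: int = 16):
        super().__init__()
        # Factor each projection W (dim x dim) as B(A(x)) with rank r << dim.
        self.q_a = nn.Linear(dim, rank, bias=False)
        self.q_b = nn.Linear(rank, dim, bias=False)
        self.k_a = nn.Linear(dim, rank, bias=False)
        self.k_b = nn.Linear(rank, dim, bias=False)
        self.scale = dim ** -0.5

    def forward(self, text_feats: torch.Tensor, audio_feats: torch.Tensor) -> torch.Tensor:
        # Queries come from one modality, keys/values from the other.
        q = self.q_b(self.q_a(text_feats))            # (batch, seq, dim)
        k = self.k_b(self.k_a(audio_feats))           # (batch, seq, dim)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ audio_feats                     # fused representation

fusion = LowRankCrossModalAttention()
text = torch.randn(2, 10, 256)    # toy batch: 2 conversations, 10 utterances
audio = torch.randn(2, 10, 256)
fused = fusion(text, audio)       # (2, 10, 256)
```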