Multidimensional Human Activity Recognition With Large Language Model: A Conceptual Framework
- URL: http://arxiv.org/abs/2410.03546v1
- Date: Mon, 16 Sep 2024 21:36:23 GMT
- Title: Multidimensional Human Activity Recognition With Large Language Model: A Conceptual Framework
- Authors: Syed Mhamudul Hasan,
- Abstract summary: In high-stakes environments like emergency response or elder care, the integration of large language models (LLMs) revolutionizes risk assessment, resource allocation, and emergency response.
We propose a conceptual framework that utilizes various wearable devices, each considered as a single dimension, to support a multidimensional learning approach within Human Activity Recognition (HAR) systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In high-stakes environments like emergency response or elder care, the integration of large language models (LLMs) revolutionizes risk assessment, resource allocation, and emergency response in Human Activity Recognition (HAR) systems by leveraging data from various wearable sensors. We propose a conceptual framework that utilizes various wearable devices, each considered as a single dimension, to support a multidimensional learning approach within HAR systems. By integrating and processing data from these diverse sources, LLMs can translate complex sensor inputs into actionable insights. This integration mitigates the inherent uncertainties and complexities of those inputs, thus enhancing the responsiveness and effectiveness of emergency services. This paper sets the stage for exploring the transformative potential of LLMs within HAR systems, empowering emergency workers to navigate the unpredictable and risky environments they encounter in their critical roles.
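The abstract's core idea, each wearable device treated as one dimension whose readings are fused into an LLM-readable summary, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `SensorDimension` schema, `build_llm_prompt` function, and the mean-based summarization are all hypothetical, and a real system would forward the resulting prompt to an actual LLM.

```python
from dataclasses import dataclass, field

@dataclass
class SensorDimension:
    """One wearable device, treated as a single dimension (hypothetical schema)."""
    device: str            # e.g. "wrist IMU"
    modality: str          # e.g. "accelerometer"
    readings: list[float] = field(default_factory=list)

def build_llm_prompt(dimensions: list[SensorDimension]) -> str:
    """Fuse per-device summaries into one natural-language prompt for an LLM.

    Each dimension is condensed to simple statistics here; richer
    summarization (windows, anomaly flags) would slot in the same way.
    """
    lines = ["Assess the wearer's activity and any emergency risk from these sensors:"]
    for dim in dimensions:
        mean = sum(dim.readings) / len(dim.readings) if dim.readings else 0.0
        lines.append(f"- {dim.device} ({dim.modality}): mean={mean:.2f}, n={len(dim.readings)}")
    return "\n".join(lines)

# Two hypothetical dimensions: motion and heart rate.
dims = [
    SensorDimension("wrist IMU", "accelerometer", [0.9, 1.1, 9.8]),
    SensorDimension("chest strap", "heart rate", [110.0, 128.0]),
]
print(build_llm_prompt(dims))
```

The design point is that fusion happens in natural language rather than in a fixed feature vector, which is what lets the LLM reconcile heterogeneous, uncertain sensor inputs into actionable insight.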
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Interpretable Concept-based Deep Learning Framework for Multimodal Human Behavior Modeling [5.954573238057435]
EU General Data Protection Regulation requires any high-risk AI systems to be sufficiently interpretable.
Existing explainable methods often compromise between interpretability and performance.
We propose a novel and generalizable framework, namely the Attention-Guided Concept Model (AGCM)
AGCM provides learnable conceptual explanations by identifying which concepts lead to the predictions and where they are observed.
arXiv Detail & Related papers (2025-02-14T13:15:21Z) - SenseRAG: Constructing Environmental Knowledge Bases with Proactive Querying for LLM-Based Autonomous Driving [10.041702058108482]
This study addresses the critical need for enhanced situational awareness in autonomous driving (AD) by leveraging the contextual reasoning capabilities of large language models (LLMs)
Unlike traditional perception systems that rely on rigid, label-based annotations, it integrates real-time, multimodal sensor data into a unified, LLM-readable knowledge base.
Experimental results using real-world Vehicle-to-everything (V2X) datasets demonstrate significant improvements in perception and prediction performance.
arXiv Detail & Related papers (2025-01-07T05:15:46Z) - RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, showing their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z) - Selective Exploration and Information Gathering in Search and Rescue Using Hierarchical Learning Guided by Natural Language Input [5.522800137785975]
We introduce a system that integrates social interaction via large language models (LLMs) with a hierarchical reinforcement learning (HRL) framework.
The proposed system is designed to translate verbal inputs from human stakeholders into actionable RL insights and adjust its search strategy.
By leveraging human-provided information through LLMs and structuring task execution through HRL, our approach significantly improves the agent's learning efficiency and decision-making process in environments characterised by long horizons and sparse rewards.
arXiv Detail & Related papers (2024-09-20T12:27:47Z) - Cooperative Resilience in Artificial Intelligence Multiagent Systems [2.0608564715600273]
This paper proposes a clear definition of 'cooperative resilience' and a methodology for its quantitative measurement.
The results highlight the crucial role of resilience metrics in analyzing how the collective system prepares for, resists, recovers from, sustains well-being, and transforms in the face of disruptions.
arXiv Detail & Related papers (2024-09-20T03:28:48Z) - A Study on Prompt Injection Attack Against LLM-Integrated Mobile Robotic Systems [4.71242457111104]
Large Language Models (LLMs) can process multi-modal prompts, enabling them to generate more context-aware responses.
One of the primary concerns is the potential security risks associated with using LLMs in robotic navigation tasks.
This study investigates the impact of prompt injections on mobile robot performance in LLM-integrated systems.
arXiv Detail & Related papers (2024-08-07T02:48:22Z) - AntEval: Evaluation of Social Interaction Competencies in LLM-Driven Agents [65.16893197330589]
Large Language Models (LLMs) have demonstrated their ability to replicate human behaviors across a wide range of scenarios.
However, their capability in handling complex, multi-character social interactions has yet to be fully explored.
We introduce the Multi-Agent Interaction Evaluation Framework (AntEval), encompassing a novel interaction framework and evaluation methods.
arXiv Detail & Related papers (2024-01-12T11:18:00Z) - Agent AI: Surveying the Horizons of Multimodal Interaction [83.18367129924997]
"Agent AI" is a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data.
We envision a future where people can easily create any virtual reality or simulated scene and interact with agents embodied within the virtual environment.
arXiv Detail & Related papers (2024-01-07T19:11:18Z) - A Low-rank Matching Attention based Cross-modal Feature Fusion Method for Conversational Emotion Recognition [54.44337276044968]
We introduce a novel and lightweight cross-modal feature fusion method called Low-Rank Matching Attention Method (LMAM)
LMAM effectively captures contextual emotional semantic information in conversations while mitigating the quadratic complexity issue caused by the self-attention mechanism.
Experimental results verify the superiority of LMAM compared with other popular cross-modal fusion methods on the premise of being more lightweight.
arXiv Detail & Related papers (2023-06-16T16:02:44Z) - Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information [77.19830787312743]
In real-world reinforcement learning applications, the learner's observation space is typically high-dimensional, containing both relevant and irrelevant information about the task at hand.
We introduce a new problem setting for reinforcement learning, the Exogenous Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable component and a large irrelevant component.
We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity that scales with the size of the endogenous component.
arXiv Detail & Related papers (2022-06-09T05:19:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.