RAG-HAR: Retrieval Augmented Generation-based Human Activity Recognition
- URL: http://arxiv.org/abs/2512.08984v1
- Date: Sat, 06 Dec 2025 01:53:02 GMT
- Title: RAG-HAR: Retrieval Augmented Generation-based Human Activity Recognition
- Authors: Nirhoshan Sivaroopan, Hansi Karunarathna, Chamara Madarasingha, Anura Jayasumana, Kanchana Thilakarathna
- Abstract summary: We introduce RAG-HAR, a training-free retrieval-augmented framework that leverages large language models (LLMs) for Human Activity Recognition (HAR). RAG-HAR computes lightweight statistical descriptors, retrieves semantically similar samples from a vector database, and uses this contextual evidence for LLM-based activity identification.
- Score: 5.089700375729287
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human Activity Recognition (HAR) underpins applications in healthcare, rehabilitation, fitness tracking, and smart environments, yet existing deep learning approaches demand dataset-specific training, large labeled corpora, and significant computational resources. We introduce RAG-HAR, a training-free retrieval-augmented framework that leverages large language models (LLMs) for HAR. RAG-HAR computes lightweight statistical descriptors, retrieves semantically similar samples from a vector database, and uses this contextual evidence for LLM-based activity identification. We further enhance RAG-HAR by applying prompt optimization and by introducing an LLM-based activity descriptor that generates context-enriched vector databases, delivering accurate and highly relevant contextual information. With these mechanisms, RAG-HAR achieves state-of-the-art performance across six diverse HAR benchmarks. Most importantly, RAG-HAR attains these improvements without requiring model training or fine-tuning, underscoring its robustness and practical applicability. RAG-HAR also moves beyond known behaviors, enabling the recognition and meaningful labelling of multiple unseen human activities.
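The retrieval stage described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the descriptor set (per-axis mean/std/min/max), the Euclidean k-NN store, and the helper names (`describe`, `VectorStore`, `build_prompt`) are all assumptions chosen for clarity; the paper's actual descriptors, similarity metric, and prompt format may differ.

```python
# Sketch of the RAG-HAR pipeline from the abstract:
# (1) compute lightweight statistical descriptors for a sensor window,
# (2) retrieve the most similar labelled samples from a vector store,
# (3) fold the retrieved evidence into a prompt for the LLM step.
import numpy as np

def describe(window: np.ndarray) -> np.ndarray:
    """Per-axis mean/std/min/max of a (timesteps, axes) IMU window."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0),
                           window.min(axis=0), window.max(axis=0)])

class VectorStore:
    """Minimal stand-in for the vector database: Euclidean k-NN."""
    def __init__(self):
        self.vecs, self.labels = [], []

    def add(self, vec: np.ndarray, label: str) -> None:
        self.vecs.append(vec)
        self.labels.append(label)

    def query(self, vec: np.ndarray, k: int = 3):
        dists = np.linalg.norm(np.stack(self.vecs) - vec, axis=1)
        top = np.argsort(dists)[:k]
        return [(self.labels[i], float(dists[i])) for i in top]

def build_prompt(neighbours) -> str:
    """Assemble retrieved evidence for the training-free LLM call."""
    lines = [f"- similar sample labelled '{lab}' (distance {d:.2f})"
             for lab, d in neighbours]
    return ("Given these retrieved examples:\n" + "\n".join(lines) +
            "\nWhich activity is the new sample?")

# Toy usage: high-variance windows stand in for walking, low-variance for sitting.
rng = np.random.default_rng(0)
store = VectorStore()
for _ in range(5):
    store.add(describe(rng.normal(0, 1.0, (50, 3))), "walking")
    store.add(describe(rng.normal(0, 0.1, (50, 3))), "sitting")

neighbours = store.query(describe(rng.normal(0, 0.1, (50, 3))))
prompt = build_prompt(neighbours)
```

Because the descriptors separate activity intensities by magnitude, the low-variance query retrieves the low-variance ("sitting") samples; the resulting prompt carries that contextual evidence to the LLM, which does the final labelling without any training.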
Related papers
- Multi-hop Reasoning via Early Knowledge Alignment [68.28168992785896]
Early Knowledge Alignment (EKA) aims to align Large Language Models with contextually relevant retrieved knowledge. EKA significantly improves retrieval precision, reduces cascading errors, and enhances both performance and efficiency. EKA proves effective as a versatile, training-free inference strategy that scales seamlessly to large models.
arXiv Detail & Related papers (2025-12-23T08:14:44Z)
- On-device Large Multi-modal Agent for Human Activity Recognition [1.9342524451932614]
Human Activity Recognition (HAR) has been an active area of research, with applications ranging from healthcare to smart environments. Recent advancements in Large Language Models (LLMs) have opened new possibilities to leverage their capabilities in HAR. We present a Large Multi-Modal Agent designed for HAR, which integrates the power of LLMs to enhance both performance and user engagement.
arXiv Detail & Related papers (2025-12-17T22:05:05Z)
- Efficient Online Continual Learning in Sensor-Based Human Activity Recognition [8.720698253117837]
This paper introduces PTRN-HAR, the first successful application of PTM-based OCL to sensor-based HAR. PTRN-HAR pre-trains the feature extractor using contrastive loss with a limited amount of data.
arXiv Detail & Related papers (2025-11-04T08:48:36Z)
- MARAG-R1: Beyond Single Retriever via Reinforcement-Learned Multi-Tool Agentic Retrieval [50.30107119622642]
Large Language Models (LLMs) excel at reasoning and generation but are inherently limited by static pretraining data. Retrieval-Augmented Generation (RAG) addresses this issue by grounding LLMs in external knowledge. MARAG-R1 is a reinforcement-learned multi-tool RAG framework that enables LLMs to dynamically coordinate multiple retrieval mechanisms.
arXiv Detail & Related papers (2025-10-31T15:51:39Z)
- GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models [59.72897499248909]
We propose a novel graph retriever trained end-to-end with Large Language Models (LLMs). Within the extracted subgraph, structural knowledge and semantic features are encoded via soft tokens and the verbalized graph, respectively, and infused into the LLM together. Our approach consistently achieves state-of-the-art performance, validating the strength of joint graph-LLM optimization for complex reasoning tasks.
arXiv Detail & Related papers (2025-09-20T02:38:00Z)
- Towards Generalizable Human Activity Recognition: A Survey [4.08377734173712]
IMU-based Human Activity Recognition (HAR) has attracted increasing attention from both academia and industry in recent years. HAR performance has improved considerably in specific scenarios, but its generalization capability remains a key barrier to widespread real-world adoption. In this survey, we explore the rapidly evolving field of IMU-based generalizable HAR, reviewing 229 research papers alongside 25 publicly available datasets.
arXiv Detail & Related papers (2025-08-17T03:04:39Z)
- Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation [52.3707788779464]
We introduce a novel Jensen-Shannon Divergence driven method to Attribute Response to Context (ARC-JSD). ARC-JSD enables efficient and accurate identification of essential context sentences without additional fine-tuning, gradient calculation, or surrogate modelling. Evaluations on a wide range of RAG benchmarks, such as TyDi QA, HotpotQA, and MuSiQue, using instruction-tuned LLMs of different scales demonstrate superior accuracy and significant computational efficiency improvements.
arXiv Detail & Related papers (2025-05-22T09:04:03Z)
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
Generative retrieval reformulates retrieval as an autoregressive generation task, where large language models generate target documents directly from a query. We systematically investigate training and inference scaling laws in generative retrieval, exploring how model size, training data scale, and inference-time compute jointly influence performance.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
- Learning Task Representations from In-Context Learning [67.66042137487287]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning (ICL). We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads. The proposed method successfully extracts task-specific information from in-context demonstrations and excels in both text and regression tasks.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
- MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents [28.419007116364668]
MLLM agents demonstrate potential for complex embodied tasks by retrieving multimodal task-relevant trajectory data. Current retrieval methods primarily focus on surface-level similarities of textual or visual cues in trajectories, neglecting their effectiveness for the specific task at hand. We propose a novel method, MLLM As ReTriever (MART), which enhances the performance of embodied agents by utilizing interaction data.
arXiv Detail & Related papers (2024-10-04T14:10:39Z)
- Aligning Data Selection with Performance: Performance-driven Reinforcement Learning for Active Learning in Object Detection [31.304039641225504]
This paper introduces Mean-AP Guided Reinforced Active Learning for Object Detection (MGRAL). MGRAL is a novel approach that leverages the concept of expected model output changes as informativeness for deep detection networks. Our approach demonstrates strong performance, establishing a new paradigm in reinforcement learning-based active learning for object detection.
arXiv Detail & Related papers (2023-10-12T14:59:22Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Sensor Data for Human Activity Recognition: Feature Representation and Benchmarking [27.061240686613182]
The field of Human Activity Recognition (HAR) focuses on obtaining and analysing data captured from monitoring devices (e.g. sensors).
We address the issue of accurately recognising human activities using different Machine Learning (ML) techniques.
arXiv Detail & Related papers (2020-05-15T00:46:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.