Related papers: SensorChat: Answering Qualitative and Quantitative Questions during Long-Term Multimodal Sensor Interactions

URL: http://arxiv.org/abs/2502.02883v1
Date: Wed, 05 Feb 2025 04:41:59 GMT
Title: SensorChat: Answering Qualitative and Quantitative Questions during Long-Term Multimodal Sensor Interactions
Authors: Xiaofan Yu, Lanxiang Hu, Benjamin Reichman, Dylan Chu, Rushil Chandrupatla, Xiyuan Zhang, Larry Heck, Tajana Rosing
Abstract summary: We introduce SensorChat, the first end-to-end QA system designed for long-term sensor monitoring. SensorChat effectively answers both qualitative (requiring high-level reasoning) and quantitative (requiring accurate responses from sensor data) questions in real-world scenarios. We implement SensorChat and demonstrate its capability for real-time interactions on a cloud server while also being able to run entirely on edge platforms after quantization.
Score: 7.549011805153971
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Natural language interaction with sensing systems is crucial for enabling all users to comprehend sensor data and its impact on their everyday lives. However, existing systems, which typically operate in a Question Answering (QA) manner, are significantly limited in terms of the duration and complexity of sensor data they can handle. In this work, we introduce SensorChat, the first end-to-end QA system designed for long-term sensor monitoring with multimodal and high-dimensional data including time series. SensorChat effectively answers both qualitative (requiring high-level reasoning) and quantitative (requiring accurate responses derived from sensor data) questions in real-world scenarios. To achieve this, SensorChat uses an innovative three-stage pipeline that includes question decomposition, sensor data query, and answer assembly. The first and third stages leverage Large Language Models (LLMs) for intuitive human interactions and to guide the sensor data query process. Unlike existing multimodal LLMs, SensorChat incorporates an explicit query stage to precisely extract factual information from long-duration sensor data. We implement SensorChat and demonstrate its capability for real-time interactions on a cloud server while also being able to run entirely on edge platforms after quantization. Comprehensive QA evaluations show that SensorChat achieves up to 26% higher answer accuracy than state-of-the-art systems on quantitative questions. Additionally, a user study with eight volunteers highlights SensorChat's effectiveness in handling qualitative and open-ended questions.
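
The three-stage pipeline described in the abstract (question decomposition, sensor data query, answer assembly) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the paper uses LLMs for the first and third stages, whereas the keyword stub and string template below merely stand in for them, and every name here (ACTIVITY_LOG, Query, decompose, run_query, assemble) as well as the toy data is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical toy data: minutes per activity per day (not taken from the paper).
ACTIVITY_LOG = {
    date(2025, 1, 6): {"walking": 42, "sitting": 510},
    date(2025, 1, 7): {"walking": 65, "sitting": 480},
}


@dataclass
class Query:
    activity: str   # which activity to look up
    aggregate: str  # "sum" over all days, or "max" for the single largest day


def decompose(question: str) -> Query:
    """Stage 1: turn a question into a structured query.
    The paper uses an LLM here; this keyword match is only a stand-in."""
    q = question.lower()
    activity = next((a for a in ("walking", "sitting") if a in q), None)
    if activity is None:
        raise ValueError("question mentions no known activity")
    aggregate = "max" if ("most" in q or "longest" in q) else "sum"
    return Query(activity, aggregate)


def run_query(query: Query, log: dict) -> int:
    """Stage 2: explicit query, so the number comes from stored data
    rather than from a language model's guess."""
    values = [day.get(query.activity, 0) for day in log.values()]
    return max(values) if query.aggregate == "max" else sum(values)


def assemble(query: Query, value: int) -> str:
    """Stage 3: phrase the answer. The paper uses an LLM; this is a template."""
    scope = "on your busiest day" if query.aggregate == "max" else "in total"
    return f"You spent {value} minutes {query.activity} {scope}."


def answer_question(question: str) -> str:
    query = decompose(question)
    return assemble(query, run_query(query, ACTIVITY_LOG))


print(answer_question("How long did I spend walking in total?"))
```

Separating the query stage from the language stages, as in this sketch, means a quantitative answer is computed directly from the stored readings, which is the motivation the abstract gives for the explicit query stage.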

Related papers

Gensors: Authoring Personalized Visual Sensors with Multimodal Foundation Models and Reasoning [61.17099595835263]
Gensors is a system that empowers users to define customized sensors supported by the reasoning capabilities of MLLMs. In a user study, participants reported a significantly greater sense of control, understanding, and ease of communication when defining sensors using Gensors.
arXiv Detail & Related papers (2025-01-27T01:47:57Z)
SensorQA: A Question Answering Benchmark for Daily-Life Monitoring [1.925154869666529]
SensorQA is the first human-created question-answering dataset for long-term time-series sensor data for daily life monitoring. We establish benchmarks for state-of-the-art AI models on this dataset and evaluate their performance on typical edge devices. Our results reveal a gap between current models and optimal QA performance and efficiency, highlighting the need for new contributions.
arXiv Detail & Related papers (2025-01-09T05:06:44Z)
Show Me What and Where has Changed? Question Answering and Grounding for Remote Sensing Change Detection [82.65760006883248]
We introduce a new task named Change Detection Question Answering and Grounding (CDQAG). CDQAG extends the traditional change detection task by providing interpretable textual answers and intuitive visual evidence. We construct the first CDQAG benchmark dataset, termed QAG-360K, comprising over 360K triplets of questions, textual answers, and corresponding high-quality visual masks.
arXiv Detail & Related papers (2024-10-31T11:20:13Z)
SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing [6.8009140511761546]
Large Language Models (LLMs) have promising capabilities in processing sensory data, suggesting their potential as copilots for developing sensing systems. We construct a comprehensive benchmark, SensorBench, to establish a quantifiable objective. The results show that while LLMs exhibit considerable proficiency in simpler tasks, they face inherent challenges in processing compositional tasks.
arXiv Detail & Related papers (2024-10-14T17:21:39Z)
SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition [9.072495000412943]
We bridge the gap between wearable sensor technology and personalized AI assistants by enabling Large Language Models (LLMs) to understand time-series tasks like human activity recognition (HAR). We introduce SensorLLM, a two-stage framework to unlock LLMs' potential for sensor data tasks. We show that SensorLLM evolves into an effective sensor learner, reasoner, and classifier, enabling it to generalize across diverse datasets for HAR tasks.
arXiv Detail & Related papers (2024-10-14T15:30:41Z)
Anomaly Detection and Inter-Sensor Transfer Learning on Smart Manufacturing Datasets [6.114996271792091]
In many cases, the goal of the smart manufacturing system is to rapidly detect (or anticipate) failures to reduce operational cost and eliminate downtime. This often boils down to detecting anomalies within the sensor data acquired from the system. The smart manufacturing application domain poses certain salient technical challenges. We show that predictive failure classification can be achieved, thus paving the way for predictive maintenance.
arXiv Detail & Related papers (2022-06-13T17:51:24Z)
Learning Online Multi-Sensor Depth Fusion [100.84519175539378]
SenFuNet is a depth fusion approach that learns sensor-specific noise and outlier statistics. We conduct experiments with various sensor combinations on the real-world CoRBS and Scene3D datasets.
arXiv Detail & Related papers (2022-04-07T10:45:32Z)
Bayesian Imitation Learning for End-to-End Mobile Manipulation [80.47771322489422]
Augmenting policies with additional sensor inputs, such as RGB + depth cameras, is a straightforward approach to improving robot perception capabilities. We show that using the Variational Information Bottleneck to regularize convolutional neural networks improves generalization to held-out domains. We demonstrate that our method is able to help close the sim-to-real gap and successfully fuse RGB and depth modalities.
arXiv Detail & Related papers (2022-02-15T17:38:30Z)
Benchmarking high-fidelity pedestrian tracking systems for research, real-time monitoring and crowd control [55.41644538483948]
High-fidelity pedestrian tracking in real-life conditions has been an important tool in fundamental crowd dynamics research. As this technology advances, it is becoming increasingly useful also in society. To successfully employ pedestrian tracking techniques in research and technology, it is crucial to validate and benchmark them for accuracy. We present and discuss a benchmark suite, towards an open standard in the community, for privacy-respectful pedestrian tracking techniques.
arXiv Detail & Related papers (2021-08-26T11:45:26Z)
SensiX: A Platform for Collaborative Machine Learning on the Edge [69.1412199244903]
We present SensiX, a personal edge platform that stays between sensor data and sensing models. We demonstrate its efficacy in developing motion and audio-based multi-device sensing systems. Our evaluation shows that SensiX offers a 7-13% increase in overall accuracy and up to 30% increase across different environment dynamics at the expense of 3mW power overhead.
arXiv Detail & Related papers (2020-12-04T23:06:56Z)
Ant Colony Inspired Machine Learning Algorithm for Identifying and Emulating Virtual Sensors [0.0]
It should be possible to emulate the output of certain sensors based on other sensors. In order to identify the subset of sensors whose readings can be emulated, the sensors must be grouped into clusters. This paper proposes an end-to-end algorithmic solution, to realise virtual sensors in such systems.
arXiv Detail & Related papers (2020-11-02T09:06:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.