Related papers: DomusFM: A Foundation Model for Smart-Home Sensor Data

DomusFM: A Foundation Model for Smart-Home Sensor Data

URL: http://arxiv.org/abs/2602.01910v1
Date: Mon, 02 Feb 2026 10:16:34 GMT
Title: DomusFM: A Foundation Model for Smart-Home Sensor Data
Authors: Michele Fiori, Gabriele Civitarese, Flora D. Salim, Claudio Bettini,
Abstract summary: We introduce DomusFM, the first foundation model specifically designed and pretrained for smart-home sensor data.<n>DomusFM employs a self-supervised dual contrastive learning paradigm to capture both token-level semantic attributes and sequence-level temporal dependencies.<n>Our approach addresses data scarcity while maintaining practical deployability for real-world smart-home systems.
Score: 11.28458211143065
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Smart-home sensor data holds significant potential for several applications, including healthcare monitoring and assistive technologies. Existing approaches, however, face critical limitations. Supervised models require impractical amounts of labeled data. Foundation models for activity recognition focus only on inertial sensors, failing to address the unique characteristics of smart-home binary sensor events: their sparse, discrete nature combined with rich semantic associations. LLM-based approaches, while tested in this domain, still raise several issues regarding the need for natural language descriptions or prompting, and reliance on either external services or expensive hardware, making them infeasible in real-life scenarios due to privacy and cost concerns. We introduce DomusFM, the first foundation model specifically designed and pretrained for smart-home sensor data. DomusFM employs a self-supervised dual contrastive learning paradigm to capture both token-level semantic attributes and sequence-level temporal dependencies. By integrating semantic embeddings from a lightweight language model and specialized encoders for temporal patterns and binary states, DomusFM learns generalizable representations that transfer across environments and tasks related to activity and event analysis. Through leave-one-dataset-out evaluation across seven public smart-home datasets, we demonstrate that DomusFM outperforms state-of-the-art baselines on different downstream tasks, achieving superior performance even with only 5% of labeled training data available for fine-tuning. Our approach addresses data scarcity while maintaining practical deployability for real-world smart-home systems.

Related papers

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems [75.78934957242403]
Self-driving vehicles and drones require true Spatial Intelligence from multi-modal onboard sensor data.<n>This paper presents a framework for multi-modal pre-training, identifying the core set of techniques driving progress toward this goal.
arXiv Detail & Related papers (2025-12-30T17:58:01Z)
From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs [65.04549036809557]
We introduce a benchmark built from pedestrian-perspective videos captured with synchronized stereo cameras, LiDAR, and IMU/GPS sensors.<n>This dataset provides metrically precise 3D information, enabling the automatic generation of spatial reasoning questions.<n> Evaluations reveal that the performance gains observed in structured indoor benchmarks vanish in open-world settings.
arXiv Detail & Related papers (2025-12-22T18:58:12Z)
Learning More with Less: A Generalizable, Self-Supervised Framework for Privacy-Preserving Capacity Estimation with EV Charging Data [84.37348569981307]
We propose a first-of-its-kind capacity estimation model based on self-supervised pre-training.<n>Our model consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-10-05T08:58:35Z)
DeepFeatIoT: Unifying Deep Learned, Randomized, and LLM Features for Enhanced IoT Time Series Sensor Data Classification in Smart Industries [2.2120045208641184]
Internet of Things (IoT) sensors are ubiquitous technologies deployed across smart cities, industrial sites, and healthcare systems.<n>We propose a novel deep learning model, DeepFeatIoT, which integrates learned local and global features with non-learned randomized convolutional kernel-based features.<n>Our model's effectiveness is demonstrated through its consistent and generalized performance across multiple real-world IoT sensor datasets.
arXiv Detail & Related papers (2025-08-13T03:47:33Z)
Stochastic Encodings for Active Feature Acquisition [100.47043816019888]
Active Feature Acquisition is an instance-wise, sequential decision making problem.<n>The aim is to dynamically select which feature to measure based on current observations, independently for each test instance.<n>Common approaches either use Reinforcement Learning, which experiences training difficulties, or greedily maximize the conditional mutual information of the label and unobserved features, which makes myopic.<n>We introduce a latent variable model, trained in a supervised manner. Acquisitions are made by reasoning about the features across many possible unobserved realizations in a latent space.
arXiv Detail & Related papers (2025-08-03T23:48:46Z)
LSM-2: Learning from Incomplete Wearable Sensor Data [65.58595667477505]
This paper introduces the second generation of Large Sensor Model (LSM-2) with Adaptive and Inherited Masking (AIM)<n>AIM learns robust representations directly from incomplete data without requiring explicit imputation.<n>Our LSM-2 with AIM achieves the best performance across a diverse range of tasks, including classification, regression and generative modeling.
arXiv Detail & Related papers (2025-06-05T17:57:11Z)
Spatial-Temporal-Spectral Unified Modeling for Remote Sensing Dense Prediction [20.1863553357121]
Current deep learning architectures for remote sensing are fundamentally rigid.<n>We introduce the Spatial-Temporal-Spectral Unified Network (STSUN) for unified modeling.<n> STSUN can adapt to input and output data with arbitrary spatial sizes, temporal lengths, and spectral bands.<n>It unifies various dense prediction tasks and diverse semantic class predictions.
arXiv Detail & Related papers (2025-05-18T07:39:17Z)
PIM: Physics-Informed Multi-task Pre-training for Improving Inertial Sensor-Based Human Activity Recognition [4.503003860563811]
We propose a physics-informed multi-task pre-training (PIM) framework for IMU-based human activity recognition (HAR)<n>PIM generates pre-text tasks based on the understanding of basic physical aspects of human motion.<n>We have observed gains of almost 10% in macro f1 score and accuracy with only 2 to 8 labeled examples per class.
arXiv Detail & Related papers (2025-03-23T08:16:01Z)
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding [61.26026947423187]
Human experts excel at fine-grained visual discrimination by leveraging domain knowledge to refine perceptual features.<n>Current Multimodal Large Language Models (MLLMs) struggle to integrate reasoning into visual perception.<n>We propose DeepPerception, an MLLM enhanced with cognitive visual perception capabilities.
arXiv Detail & Related papers (2025-03-17T04:06:34Z)
SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition [13.796942110105313]
We introduce SensorLLM, a framework that enables Large Language Models (LLMs) to perform human activity recognition (HAR) from sensor time-series data.<n> SensorLLM addresses limitations through a Sensor-Language Alignment stage, where the model aligns sensor inputs with trend descriptions.<n>In the subsequent Task-Aware Tuning stage, we refine the model for HAR classification, achieving performance that matches or surpasses state-of-the-art methods.
arXiv Detail & Related papers (2024-10-14T15:30:41Z)
SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction [37.461695201579914]
We propose SmartPretrain, a general and scalable framework for motion prediction.<n>Our approach integrates contrastive and reconstructive SSL, leveraging the strengths of both generative and discriminative paradigms.<n>SmartPretrain consistently improves the performance of state-of-the-art prediction models across datasets, data splits and main metrics.
arXiv Detail & Related papers (2024-10-11T09:52:26Z)
Layout Agnostic Human Activity Recognition in Smart Homes through Textual Descriptions Of Sensor Triggers (TDOST) [0.22354214294493352]
We develop a layout-agnostic modeling approach for human activity recognition (HAR) systems in smart homes. We generate Textual Descriptions Of Sensor Triggers (TDOST) that encapsulate the surrounding trigger conditions. We demonstrate the effectiveness of TDOST-based models in unseen smart homes through experiments on benchmarked CASAS datasets.
arXiv Detail & Related papers (2024-05-20T20:37:44Z)
Know Thy Neighbors: A Graph Based Approach for Effective Sensor-Based Human Activity Recognition in Smart Homes [0.0]
We propose a novel graph-guided neural network approach for Human Activity Recognition (HAR) in smart homes. We accomplish this by learning a more expressive graph structure representing the sensor network in a smart home. Our approach maps discrete input sensor measurements to a feature space through the application of attention mechanisms.
arXiv Detail & Related papers (2023-11-16T02:43:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.