Related papers: Inference from Real-World Sparse Measurements

Inference from Real-World Sparse Measurements

URL: http://arxiv.org/abs/2210.11269v7
Date: Mon, 15 Apr 2024 08:24:11 GMT
Title: Inference from Real-World Sparse Measurements
Authors: Arnaud Pannatier, Kyle Matoba, François Fleuret,
Abstract summary: Real-world problems often involve complex and unstructured sets of measurements, which occur when sensors are sparsely placed in either space or time. Deep learning architectures capable of processing sets of measurements with positions varying from set to set and extracting readouts anywhere are methodologically difficult. We propose an attention-based model focused on applicability and practical robustness, with two key design contributions.
Score: 21.194357028394226
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Real-world problems often involve complex and unstructured sets of measurements, which occur when sensors are sparsely placed in either space or time. Being able to model this irregular spatiotemporal data and extract meaningful forecasts is crucial. Deep learning architectures capable of processing sets of measurements with positions varying from set to set, and extracting readouts anywhere are methodologically difficult. Current state-of-the-art models are graph neural networks and require domain-specific knowledge for proper setup. We propose an attention-based model focused on robustness and practical applicability, with two key design contributions. First, we adopt a ViT-like transformer that takes both context points and read-out positions as inputs, eliminating the need for an encoder-decoder structure. Second, we use a unified method for encoding both context and read-out positions. This approach is intentionally straightforward and integrates well with other systems. Compared to existing approaches, our model is simpler, requires less specialized knowledge, and does not suffer from a problematic bottleneck effect, all of which contribute to superior performance. We conduct in-depth ablation studies that characterize this problematic bottleneck in the latent representations of alternative models that inhibit information utilization and impede training efficiency. We also perform experiments across various problem domains, including high-altitude wind nowcasting, two-day weather forecasting, fluid dynamics, and heat diffusion. Our attention-based model consistently outperforms state-of-the-art models in handling irregularly sampled data. Notably, our model reduces the root mean square error (RMSE) for wind nowcasting from 9.24 to 7.98 and for heat diffusion tasks from 0.126 to 0.084.

Related papers

Lightweight Task-Oriented Semantic Communication Empowered by Large-Scale AI Models [66.57755931421285]
Large-scale artificial intelligence (LAI) models pose significant challenges for real-time communication scenarios.<n>This paper proposes utilizing knowledge distillation (KD) techniques to extract and condense knowledge from LAI models.<n>We propose a fast distillation method featuring a pre-stored compression mechanism that eliminates the need for repetitive inference.
arXiv Detail & Related papers (2025-06-16T08:42:16Z)
Filling the Missings: Spatiotemporal Data Imputation by Conditional Diffusion [7.021277706390712]
Missing intemporal systems presents a challenge for modern applications ranging from environmental monitoring to urban traffic management.<n>Current approaches based on machine learning and deep learning struggle to model thedependencies between spatial and temporal dimensions effectively.<n>CoFILL builds on the inherent advantages of diffusion-quality models to generate high-quality imputations.
arXiv Detail & Related papers (2025-06-08T11:53:06Z)
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free [81.65559031466452]
We conduct experiments to investigate gating-augmented softmax attention variants.<n>We find that a simple modification-applying a head-specific sigmoid gate after the Scaled Dot-Product Attention (SDPA)-consistently improves performance.
arXiv Detail & Related papers (2025-05-10T17:15:49Z)
Adaptive State-Space Mamba for Real-Time Sensor Data Anomaly Detection [2.922256022514318]
We propose an emphAdaptive State-Space Mamba framework for real-time sensor data anomaly detection. Our approach is easily to other time-series tasks that demand rapid and reliable detection capabilities.
arXiv Detail & Related papers (2025-03-26T21:37:48Z)
ConsistentFeature: A Plug-and-Play Component for Neural Network Regularization [0.32885740436059047]
Over- parameterized neural network models often lead to significant performance discrepancies between training and test sets. We introduce a simple perspective on overfitting: models learn different representations in different i.i.d. datasets. We propose an adaptive method, ConsistentFeature, that regularizes the model by constraining feature differences across random subsets of the same training set.
arXiv Detail & Related papers (2024-12-02T13:21:31Z)
Kriformer: A Novel Spatiotemporal Kriging Approach Based on Graph Transformers [5.4381914710364665]
This study addresses posed by sparse sensor deployment and unreliable data by framing the problem as an environmental challenge. A graphkriformer model, Kriformer, estimates data at locations without sensors by mining spatial and temporal correlations, even with limited resources.
arXiv Detail & Related papers (2024-09-23T11:01:18Z)
Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture. We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
Open-Set Deepfake Detection: A Parameter-Efficient Adaptation Method with Forgery Style Mixture [58.60915132222421]
We introduce an approach that is both general and parameter-efficient for face forgery detection. We design a forgery-style mixture formulation that augments the diversity of forgery source domains. We show that the designed model achieves state-of-the-art generalizability with significantly reduced trainable parameters.
arXiv Detail & Related papers (2024-08-23T01:53:36Z)
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction. SMILE allows for the upscaling of source models into an MoE model without extra data or further training. We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
Time Series Representation Models [2.724184832774005]
Time series analysis remains a major challenge due to its sparse characteristics, high dimensionality, and inconsistent data quality. Recent advancements in transformer-based techniques have enhanced capabilities in forecasting and imputation. We propose a new architectural concept for time series analysis based on introspection.
arXiv Detail & Related papers (2024-05-28T13:25:31Z)
Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues. We propose a novel approach to address this issue at test time without requiring retraining. MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification. Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
Energy networks for state estimation with random sensors using sparse labels [0.0]
We propose a technique with an implicit optimization layer and a physics-based loss function that can learn from sparse labels. Based on this technique we present two models for discrete and continuous prediction in space.
arXiv Detail & Related papers (2022-03-12T15:15:38Z)
TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied on IT system operation and maintenance. One direction aims at the recognition of re-occurring anomaly types to enable remediation automation. We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
Fast Modeling and Understanding Fluid Dynamics Systems with Encoder-Decoder Networks [0.0]
We show that an accurate deep-learning-based proxy model can be taught efficiently by a finite-volume-based simulator. Compared to traditional simulation, the proposed deep learning approach enables much faster forward computation. We quantify the sensitivity of the deep learning model to key physical parameters and hence demonstrate that the inversion problems can be solved with great acceleration.
arXiv Detail & Related papers (2020-06-09T17:14:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.