Hyperspectral Variational Autoencoders for Joint Data Compression and Component Extraction
- URL: http://arxiv.org/abs/2511.18521v1
- Date: Sun, 23 Nov 2025 16:26:09 GMT
- Title: Hyperspectral Variational Autoencoders for Joint Data Compression and Component Extraction
- Authors: Core Francisco Park, Manuel Perez-Carrasco, Caroline Nowlan, Cecilia Garraffo,
- Abstract summary: We present a variational autoencoder (VAE) approach that achieves x514 compression of NASA's TEMPO satellite hyperspectral observations (1028 channels, 290-490nm). This dramatic data volume reduction enables efficient archival and sharing of satellite observations while preserving spectral fidelity.
- Score: 5.157309015481045
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Geostationary hyperspectral satellites generate terabytes of data daily, creating critical challenges for storage, transmission, and distribution to the scientific community. We present a variational autoencoder (VAE) approach that achieves x514 compression of NASA's TEMPO satellite hyperspectral observations (1028 channels, 290-490nm) with reconstruction errors 1-2 orders of magnitude below the signal across all wavelengths. This dramatic data volume reduction enables efficient archival and sharing of satellite observations while preserving spectral fidelity. Beyond compression, we investigate to what extent atmospheric information is retained in the compressed latent space by training linear and nonlinear probes to extract Level-2 products (NO2, O3, HCHO, cloud fraction). Cloud fraction and total ozone achieve strong extraction performance (R^2 = 0.93 and 0.81 respectively), though these represent relatively straightforward retrievals given their distinct spectral signatures. In contrast, tropospheric trace gases pose genuine challenges for extraction (NO2 R^2 = 0.20, HCHO R^2 = 0.51) reflecting their weaker signals and complex atmospheric interactions. Critically, we find the VAE encodes atmospheric information in a semi-linear manner - nonlinear probes substantially outperform linear ones - and that explicit latent supervision during training provides minimal improvement, revealing fundamental encoding challenges for certain products. This work demonstrates that neural compression can dramatically reduce hyperspectral data volumes while preserving key atmospheric signals, addressing a critical bottleneck for next-generation Earth observation systems. Code - https://github.com/cfpark00/Hyperspectral-VAE
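The abstract's comparison of linear and nonlinear probes can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the latents are synthetic Gaussian vectors rather than TEMPO VAE latents, the target is a made-up function of them, and a quadratic feature expansion stands in for whatever nonlinear probe the authors actually trained. It only shows how R^2 can differ between the two probe types when the target is not a linear function of the latent.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def add_bias(features):
    """Append a constant column so the probe can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])


def fit_probe(features_train, y_train, features_test):
    """Fit a least-squares probe on the training split, predict on the test split."""
    weights, *_ = np.linalg.lstsq(add_bias(features_train), y_train, rcond=None)
    return add_bias(features_test) @ weights


def quadratic_features(latents):
    """Hypothetical nonlinear probe: latents plus their element-wise squares."""
    return np.hstack([latents, latents ** 2])


# Synthetic stand-in for latent vectors (NOT real TEMPO latents).
rng = np.random.default_rng(0)
n_samples, latent_dim = 2000, 8
z = rng.normal(size=(n_samples, latent_dim))

# Made-up target with one linear and one quadratic dependence on the latent.
y = z[:, 0] + z[:, 1] ** 2 + 0.1 * rng.normal(size=n_samples)

z_train, z_test = z[:1500], z[1500:]
y_train, y_test = y[:1500], y[1500:]

r2_linear = r2_score(y_test, fit_probe(z_train, y_train, z_test))
r2_nonlinear = r2_score(
    y_test,
    fit_probe(quadratic_features(z_train), y_train, quadratic_features(z_test)),
)

print(f"linear probe R^2:    {r2_linear:.2f}")
print(f"nonlinear probe R^2: {r2_nonlinear:.2f}")
```

In this toy setup the linear probe can only recover the linear part of the target, so its R^2 stays well below that of the quadratic probe. A gap of this kind is what the abstract describes as semi-linear encoding.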
Related papers
- Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage [65.51149575007149]
We present Fun-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates for both forward and inverse modeling. Fun-DDPS produces physically consistent realizations free from the high-frequency artifacts observed in joint-state baselines.
arXiv Detail & Related papers (2026-02-12T18:58:12Z)
- Efficient reduction of stellar contamination and noise in planetary transmission spectra using neural networks [0.0]
We present a methodology to reduce stellar contamination and instrument-specific noise in exoplanet spectra using denoising autoencoders. We design and train denoising autoencoder architectures on large synthetic datasets of terrestrial (TRAPPIST-1e analogues) and sub-Neptune (K2-18b analogues) planets.
arXiv Detail & Related papers (2026-02-10T22:07:18Z)
- VAE with Hyperspherical Coordinates: Improving Anomaly Detection from Hypervolume-Compressed Latent Space [56.362776482614976]
Variational autoencoders (VAE) encode data into lower-dimensional latent vectors before decoding those vectors back to data. We propose to formulate the latent variables of a VAE using hyperspherical coordinates, which allows compressing the latent vectors towards a given direction on the hypersphere. We show that this improves both the fully unsupervised and OOD anomaly detection ability of the VAE, achieving the best performance on the datasets we considered.
arXiv Detail & Related papers (2026-01-25T03:10:24Z)
- Hunting for "Oddballs" with Machine Learning: Detecting Anomalous Exoplanets Using a Deep-Learned Low-Dimensional Representation of Transit Spectra with Autoencoders [35.61099185492068]
This study explores the application of autoencoder-based machine learning techniques for anomaly detection to identify exoplanet atmospheres with unconventional chemical signatures. We use the Atmospheric Big Challenge (ABC) database to construct an anomaly detection scenario by defining CO2-rich atmospheres as anomalies and CO2-poor atmospheres as the normal class. We benchmarked four different anomaly detection strategies: Autoencoder Reconstruction Loss, One-Class Support Vector Machine (1 class-SVM), K-means Clustering, and Local Outlier Factor (LOF).
arXiv Detail & Related papers (2026-01-05T18:15:53Z)
- Isolation-based Spherical Ensemble Representations for Anomaly Detection [60.989157958972356]
Anomaly detection is a critical task in data mining and management with applications spanning fraud detection, network security, and log monitoring. Existing unsupervised anomaly detection methods face fundamental challenges including conflicting distributional assumptions, computational inefficiency, and difficulty handling different anomaly types. We propose ISER (Isolation-based Spherical Ensemble Representations) that extends existing isolation-based methods by using hypersphere radii as proxies for local density characteristics while maintaining linear time and constant space complexity.
arXiv Detail & Related papers (2025-10-15T09:00:05Z)
- Data-Driven Reconstruction of Significant Wave Heights from Sparse Observations [3.356199201143573]
We introduce AUWave, a hybrid deep learning framework that fuses a station-wise sequence encoder (MLP) with a multi-scale U-Net. We show that AUWave consistently outperforms a representative baseline in data-richer configurations. The architecture's multi-scale and attention components translate into accuracy gains when minimal but non-trivial spatial anchoring is available.
arXiv Detail & Related papers (2025-09-21T14:12:28Z)
- DenoDet V2: Phase-Amplitude Cross Denoising for SAR Object Detection [49.9059941674531]
We propose DenoDet V2, which exploits the complementary nature of amplitude and phase information through a band-wise mutual modulation mechanism. DenoDet V2 achieves a significant 0.8% improvement on SARDet-100K dataset compared to DenoDet V1, while reducing the model complexity by half.
arXiv Detail & Related papers (2025-08-12T23:24:20Z)
- Efficient Compression of Sparse Accelerator Data Using Implicit Neural Representations and Importance Sampling [7.838980097597047]
Large-scale particle colliders in nuclear and high-energy physics generate data at extraordinary rates. We propose a novel approach using implicit neural representations for data learning and compression. We also introduce an importance sampling technique to accelerate the network training process.
arXiv Detail & Related papers (2024-12-02T17:50:49Z)
- HiHa: Introducing Hierarchical Harmonic Decomposition to Implicit Neural Compression for Atmospheric Data [7.453116854690232]
We propose Hierarchical implicit neural compression (HiHa) for atmospheric data.
HiHa segments the data into multi-frequency signals through decomposition of multiple complex harmonics.
We additionally design a temporal residual compression module to accelerate compression by utilizing temporal continuity.
arXiv Detail & Related papers (2024-11-09T11:44:32Z)
- How Important are Data Augmentations to Close the Domain Gap for Object Detection in Orbit? [15.550663626482903]
We investigate the efficacy of data augmentations to close the domain gap in spaceborne computer vision.
We propose two novel data augmentations specifically developed to emulate the visual effects observed in orbital imagery.
arXiv Detail & Related papers (2024-10-21T08:24:46Z)
- Real-time gravitational-wave inference for binary neutron stars using machine learning [71.29593576787549]
We present a machine learning framework that performs complete BNS inference in just one second without making any approximations.
Our approach enhances multi-messenger observations by providing (i) accurate localization even before the merger; (ii) improved localization precision by $\sim 30\%$ compared to approximate low-latency methods; and (iii) detailed information on luminosity distance, inclination, and masses.
arXiv Detail & Related papers (2024-07-12T18:00:02Z)
- DMSSN: Distilled Mixed Spectral-Spatial Network for Hyperspectral Salient Object Detection [12.823338405434244]
Hyperspectral salient object detection (HSOD) has exhibited remarkable promise across various applications.
Previous methods insufficiently harness the inherent distinctive attributes of hyperspectral images (HSIs) during the feature extraction process.
We propose Distilled Mixed Spectral-Spatial Network (DMSSN), comprising a Distilled Spectral-Spatial Transformer (MSST)
We have created a large-scale HSOD dataset, HSOD-BIT, to tackle the issue of data scarcity in this field.
arXiv Detail & Related papers (2024-03-31T14:04:57Z)
- Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth [83.15263499262824]
We prove that gradient descent converges to a solution that completely disregards the sparse structure of the input.
We show how to improve upon Gaussian performance for the compression of sparse data by adding a denoising function to a shallow architecture.
We validate our findings on image datasets, such as CIFAR-10 and MNIST.
arXiv Detail & Related papers (2024-02-07T16:32:29Z)
- Imbalanced Aircraft Data Anomaly Detection [103.01418862972564]
Anomaly detection in temporal data from sensors under aviation scenarios is a practical but challenging task.
We propose a Graphical Temporal Data Analysis framework.
It consists of three modules, named Series-to-Image (S2I), Cluster-based Resampling Approach using Euclidean Distance (CRD) and Variance-Based Loss (VBL).
arXiv Detail & Related papers (2023-05-17T09:37:07Z)