SatelliteFormula: Multi-Modal Symbolic Regression from Remote Sensing Imagery for Physics Discovery
- URL: http://arxiv.org/abs/2506.06176v1
- Date: Fri, 06 Jun 2025 15:39:54 GMT
- Title: SatelliteFormula: Multi-Modal Symbolic Regression from Remote Sensing Imagery for Physics Discovery
- Authors: Zhenyu Yu, Mohd. Yamani Idna Idris, Pei Wang, Yuelong Xia, Fei Ma, Rizwan Qureshi,
- Abstract summary: We propose a novel symbolic regression framework that derives physically interpretable expressions directly from remote sensing imagery.<n>SatelliteFormula combines a Vision Transformer-based encoder for spatial-spectral feature extraction with physics-guided constraints to ensure consistency and interpretability.
- Score: 8.965479246496878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose SatelliteFormula, a novel symbolic regression framework that derives physically interpretable expressions directly from multi-spectral remote sensing imagery. Unlike traditional empirical indices or black-box learning models, SatelliteFormula combines a Vision Transformer-based encoder for spatial-spectral feature extraction with physics-guided constraints to ensure consistency and interpretability. Existing symbolic regression methods struggle with the high-dimensional complexity of multi-spectral data; our method addresses this by integrating transformer representations into a symbolic optimizer that balances accuracy and physical plausibility. Extensive experiments on benchmark datasets and remote sensing tasks demonstrate superior performance, stability, and generalization compared to state-of-the-art baselines. SatelliteFormula enables interpretable modeling of complex environmental variables, bridging the gap between data-driven learning and physical understanding.
Related papers
- Progressive Inertial Poser: Progressive Real-Time Kinematic Chain Estimation for 3D Full-Body Pose from Three IMU Sensors [25.67875816218477]
Full-body pose estimation from sparse tracking signals is not limited by environmental conditions or recording range.<n>Previous works either face the challenge of wearing additional sensors on the pelvis and lower-body or rely on external visual sensors to obtain global positions of key joints.<n>To improve the practicality of the technology for virtual reality applications, we estimate full-body poses using only inertial data obtained from three Inertial Measurement Unit (IMU) sensors worn on the head and wrists.
arXiv Detail & Related papers (2025-05-08T15:28:09Z) - SatelliteCalculator: A Multi-Task Vision Foundation Model for Quantitative Remote Sensing Inversion [4.824120664293887]
We introduce SatelliteCalculator, the first vision foundation model for quantitative remote sensing inversion.<n>By leveraging physically defined index adapters, we automatically construct a large-scale dataset of over one million paired samples.<n> Experiments demonstrate that SatelliteCalculator achieves competitive accuracy across all tasks while significantly reducing inference cost.
arXiv Detail & Related papers (2025-04-18T03:48:04Z) - UrbanSAM: Learning Invariance-Inspired Adapters for Segment Anything Models in Urban Construction [51.54946346023673]
Urban morphology is inherently complex, with irregular objects of diverse shapes and varying scales.<n>The Segment Anything Model (SAM) has shown significant potential in segmenting complex scenes.<n>We propose UrbanSAM, a customized version of SAM specifically designed to analyze complex urban environments.
arXiv Detail & Related papers (2025-02-21T04:25:19Z) - Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks.
We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then, train robust navigation policies using Liquid neural networks.
In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware
Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 levels of difficulties and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z) - Unsupervised Discovery of Semantic Concepts in Satellite Imagery with
Style-based Wavelet-driven Generative Models [27.62417543307831]
We present the first pre-trained style- and wavelet-based GAN model that can synthesize a wide gamut of realistic satellite images.
We show that by analyzing the intermediate activations of our network, one can discover a multitude of interpretable semantic directions.
arXiv Detail & Related papers (2022-08-03T14:19:24Z) - Dynamic Spatial Sparsification for Efficient Vision Transformers and
Convolutional Neural Networks [88.77951448313486]
We present a new approach for model acceleration by exploiting spatial sparsity in visual data.
We propose a dynamic token sparsification framework to prune redundant tokens.
We extend our method to hierarchical models including CNNs and hierarchical vision Transformers.
arXiv Detail & Related papers (2022-07-04T17:00:51Z) - A Trainable Spectral-Spatial Sparse Coding Model for Hyperspectral Image
Restoration [36.525810477650026]
Hyperspectral imaging offers new perspectives for diverse applications.
The lack of accurate ground-truth "clean" hyperspectral signals on the spot makes restoration tasks challenging.
In this paper, we advocate for a hybrid approach based on sparse coding principles.
arXiv Detail & Related papers (2021-11-18T14:16:04Z) - Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial
Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) represents powerful tools to solve complex robotic tasks.
RL does not work directly in the real-world, which is known as the sim-to-real transfer problem.
We propose a method that learns on an observation space constructed by point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z) - Benchmarking Unsupervised Object Representations for Video Sequences [111.81492107649889]
We compare the perceptual abilities of four object-centric approaches: ViMON, OP3, TBA and SCALOR.
Our results suggest that the architectures with unconstrained latent representations learn more powerful representations in terms of object detection, segmentation and tracking.
Our benchmark may provide fruitful guidance towards learning more robust object-centric video representations.
arXiv Detail & Related papers (2020-06-12T09:37:24Z) - Learning the sense of touch in simulation: a sim-to-real strategy for
vision-based tactile sensing [1.9981375888949469]
This paper focuses on a vision-based tactile sensor, which aims to reconstruct the distribution of the three-dimensional contact forces applied on its soft surface.
A strategy is proposed to train a tailored deep neural network entirely from the simulation data.
The resulting learning architecture is directly transferable across multiple tactile sensors without further training and yields accurate predictions on real data.
arXiv Detail & Related papers (2020-03-05T14:17:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.