Gaussianizing the Earth: Multidimensional Information Measures for Earth Data Analysis
- URL: http://arxiv.org/abs/2010.06476v2
- Date: Wed, 25 Nov 2020 10:10:44 GMT
- Title: Gaussianizing the Earth: Multidimensional Information Measures for Earth Data Analysis
- Authors: J. Emmanuel Johnson, Valero Laparra, Maria Piles, Gustau Camps-Valls
- Abstract summary: Information theory is an excellent framework for analyzing Earth system data.
It allows us to characterize uncertainty and redundancy, and is universally interpretable.
We show how information theory measures can be applied in various Earth system data analysis problems.
- Score: 9.464720193746395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Information theory is an excellent framework for analyzing Earth system data
because it allows us to characterize uncertainty and redundancy, and is
universally interpretable. However, accurately estimating information content
is challenging because spatio-temporal data is high-dimensional, heterogeneous
and has non-linear characteristics. In this paper, we apply multivariate
Gaussianization for probability density estimation which is robust to
dimensionality, comes with statistical guarantees, and is easy to apply. In
addition, this methodology allows us to estimate information-theoretic measures
to characterize multivariate densities: information, entropy, total
correlation, and mutual information. We demonstrate how information theory
measures can be applied in various Earth system data analysis problems. First,
we show how the method can be used to jointly Gaussianize radar backscattering
intensities, synthesize hyperspectral data, and quantify the information content
in aerial optical images. We also quantify the information content of several
variables describing the soil-vegetation status in agro-ecosystems, and
investigate the temporal scales that maximize their shared information under
extreme events such as droughts. Finally, we measure the relative information
content of space and time dimensions in remote sensing products and model
simulations involving long records of key variables such as precipitation,
sensible heat and evaporation. Results confirm the validity of the method, for
which we anticipate a wide use and adoption. Code and demos of the implemented
algorithms and information-theory measures are provided.
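As a rough illustration of the marginal step behind Gaussianization, the sketch below maps each variable to normal scores through its ranks and then reads mutual information off the correlation of those scores. This is a simplified, single-step Gaussian-copula estimate, not the iterative multivariate method described in the abstract. The function names (`marginal_gaussianize`, `pearson`, `gaussian_copula_mi`, `demo`) and the synthetic data are assumptions made for this example only.

```python
import math
import random
from statistics import NormalDist


def marginal_gaussianize(values):
    """Map a 1-D sample to approximately standard-normal scores via its ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, index in enumerate(order, start=1):
        # rank / (n + 1) lies strictly inside (0, 1), so inv_cdf stays finite.
        scores[index] = std_normal.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length samples."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def gaussian_copula_mi(x, y):
    """Mutual information in nats, ASSUMING the dependence is a Gaussian copula."""
    rho = pearson(marginal_gaussianize(x), marginal_gaussianize(y))
    # For a bivariate Gaussian with correlation rho, I = -0.5 * log(1 - rho^2).
    # Perfectly monotone data (|rho| = 1) would make the logarithm diverge.
    return -0.5 * math.log(1.0 - rho ** 2)


def demo(n=2000, seed=0):
    """Hypothetical synthetic data: y depends nonlinearly on x, noise does not."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [math.exp(v) + rng.gauss(0.0, 0.5) for v in x]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return gaussian_copula_mi(x, y), gaussian_copula_mi(x, noise)
```

Calling `demo()` returns two estimates: one for the dependent pair and one for the independent pair, and the first should be clearly larger. Because mutual information is unchanged by invertible transformations of each variable separately, the rank-based step does not alter the quantity being estimated; the approximation lies entirely in treating the Gaussianized pair as jointly Gaussian.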
Related papers
- Density Estimation via Binless Multidimensional Integration [45.21975243399607]
We introduce the Binless Multidimensional Thermodynamic Integration (BMTI) method for nonparametric, robust, and data-efficient density estimation.
BMTI estimates the logarithm of the density by initially computing log-density differences between neighbouring data points.
The method is tested on a variety of complex synthetic high-dimensional datasets, and is benchmarked on realistic datasets from the chemical physics literature.
arXiv Detail & Related papers (2024-07-10T23:45:20Z)
- Adversarial Estimation of Topological Dimension with Harmonic Score Maps [7.34158170612151]
We show that it is possible to retrieve the topological dimension of the manifold learned by the score map.
We then introduce a novel method to measure the learned manifold's topological dimension using adversarial attacks.
arXiv Detail & Related papers (2023-12-11T22:29:54Z)
- Data-Efficient Learning via Minimizing Hyperspherical Energy [48.47217827782576]
This paper considers the problem of data-efficient learning from scratch using a small amount of representative data.
We propose a MHE-based active learning (MHEAL) algorithm, and provide comprehensive theoretical guarantees for MHEAL.
arXiv Detail & Related papers (2022-06-30T11:39:12Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
arXiv Detail & Related papers (2021-07-05T18:30:26Z)
- Ranking the information content of distance measures [61.754016309475745]
We introduce a statistical test that can assess the relative information retained when using two different distance measures.
This in turn allows finding the most informative distance measure out of a pool of candidates.
arXiv Detail & Related papers (2021-04-30T15:57:57Z)
- Machine Learning Information Fusion in Earth Observation: A Comprehensive Review of Methods, Applications and Data Sources [0.0]
This paper reviews the most important information fusion algorithms based on Machine Learning (ML) techniques for problems in Earth observation.
Data-driven approaches, and ML techniques in particular, are the natural choice to extract significant information from this data deluge.
arXiv Detail & Related papers (2020-12-07T13:35:08Z)
- Information Theory Measures via Multidimensional Gaussianization [7.788961560607993]
Information theory is an outstanding framework to measure uncertainty, dependence and relevance in data and systems.
It has several desirable properties for real world applications.
However, obtaining information from multidimensional data is a challenging problem due to the curse of dimensionality.
arXiv Detail & Related papers (2020-10-08T07:22:16Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.