BATIS: Bayesian Approaches for Targeted Improvement of Species Distribution Models
- URL: http://arxiv.org/abs/2510.19749v1
- Date: Wed, 22 Oct 2025 16:42:46 GMT
- Title: BATIS: Bayesian Approaches for Targeted Improvement of Species Distribution Models
- Authors: Catherine Villeneuve, Benjamin Akera, Mélisande Teng, David Rolnick
- Abstract summary: Species distribution models (SDMs) aim to predict species occurrence based on environmental variables. Recent deep learning advances for SDMs have been shown to perform well on complex and heterogeneous datasets. We introduce BATIS, a novel and practical framework wherein prior predictions are updated iteratively using limited observational data.
- Score: 15.029163153558533
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Species distribution models (SDMs), which aim to predict species occurrence based on environmental variables, are widely used to monitor and respond to biodiversity change. Recent deep learning advances for SDMs have been shown to perform well on complex and heterogeneous datasets, but their effectiveness remains limited by spatial biases in the data. In this paper, we revisit deep SDMs from a Bayesian perspective and introduce BATIS, a novel and practical framework wherein prior predictions are updated iteratively using limited observational data. Models must appropriately capture both aleatoric and epistemic uncertainty to effectively combine fine-grained local insights with broader ecological patterns. We benchmark an extensive set of uncertainty quantification approaches on a novel dataset including citizen science observations from the eBird platform. Our empirical study shows how Bayesian deep learning approaches can greatly improve the reliability of SDMs in data-scarce locations, which can contribute to ecological understanding and conservation efforts.
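The idea of updating a prior prediction with a handful of local observations can be illustrated with a textbook Beta-Bernoulli conjugate update. This is only a minimal sketch under stated assumptions and is not the BATIS method itself: the abstract does not specify the update rule, and the names `BetaBelief`, `prior_from_prediction`, `update`, and the `strength` parameter (how many pseudo-checklists the prior is worth) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief over a species' encounter rate at one site."""

    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Posterior mean of the encounter rate.
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        # Variance of the Beta distribution, a rough proxy for remaining uncertainty.
        total = self.alpha + self.beta
        return self.alpha * self.beta / (total**2 * (total + 1.0))


def prior_from_prediction(p: float, strength: float) -> BetaBelief:
    """Encode a model prediction p as a Beta prior.

    `strength` is a hypothetical knob: the number of pseudo-observations
    the prior counts for. A less certain model would get a smaller value.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if strength <= 0.0:
        raise ValueError("strength must be positive")
    return BetaBelief(alpha=p * strength, beta=(1.0 - p) * strength)


def update(belief: BetaBelief, detected: bool) -> BetaBelief:
    """Fold one presence/absence observation into the belief."""
    if detected:
        return BetaBelief(belief.alpha + 1.0, belief.beta)
    return BetaBelief(belief.alpha, belief.beta + 1.0)


if __name__ == "__main__":
    # Assumed example values: prior encounter rate 0.2, worth 10 pseudo-checklists.
    belief = prior_from_prediction(0.2, strength=10.0)
    for detected in [False, True, False, False]:
        belief = update(belief, detected)
        print(f"mean={belief.mean:.3f} variance={belief.variance:.5f}")
```

With a small `strength`, a few local checklists quickly dominate the prior; with a large one, the model's prediction changes slowly. That trade-off is why the quality of the uncertainty estimate matters in data-scarce locations.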
Related papers
- Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks. We extract patch-level features from whole slide images (WSI) using five state-of-the-art foundation models. Models are trained to predict continuous HRD scores based on these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z)
- FrogDeepSDM: Improving Frog Counting and Occurrence Prediction Using Multimodal Data and Pseudo-Absence Imputation [0.9537146822132906]
Species Distribution Modelling (SDM) helps predict species presence across large regions. In this study, we enhance SDM accuracy for frogs (Anura) by applying deep learning and data imputation techniques. Experiments show that data balancing significantly improved model performance, reducing the Mean Absolute Error (MAE) from 189 to 29 in frog counting tasks.
arXiv Detail & Related papers (2025-10-22T07:09:36Z)
- Active Target Discovery under Uninformative Prior: The Power of Permanent and Transient Memory [26.488250231429774]
In many scientific and engineering fields, where acquiring high-quality data is expensive, strategic sampling of unobserved regions is crucial for maximizing discovery rates within a constrained budget. We propose a novel approach that enables effective active target discovery even in settings with uninformative priors. Unlike black-box policies, our approach is inherently interpretable, providing clear insights into decision-making.
arXiv Detail & Related papers (2025-10-19T00:42:56Z)
- Multi-scale species richness estimation with deep learning [0.0]
We combine sampling theory and deep learning to predict local species richness within arbitrarily large sampling areas. We show how our deep SAR model can provide fundamental insights on the multi-scale effects of key biodiversity processes.
arXiv Detail & Related papers (2025-07-08T19:42:33Z)
- Predicting butterfly species presence from satellite imagery using soft contrastive regularisation [1.0923877073891446]
This paper presents a new data set for predicting butterfly species presence from satellite data in the United Kingdom. We experimentally optimise a ResNet-based model to predict multi-species presence from 4-band satellite images. We develop a soft, supervised contrastive regularisation loss that is tailored to probabilistic labels.
arXiv Detail & Related papers (2025-05-14T11:42:09Z)
- MaskSDM with Shapley values to improve flexibility, robustness, and explainability in species distribution modeling [3.428447509258587]
Species Distribution Models (SDMs) play a vital role in biodiversity research, conservation planning, and ecological niche modeling. We introduce MaskSDM, a novel deep learning-based SDM that enables flexible predictor selection by employing a masked training strategy. We evaluate MaskSDM on the global sPlotOpen dataset, modeling the distributions of 12,738 plant species.
arXiv Detail & Related papers (2025-03-17T11:02:28Z)
- Combining Observational Data and Language for Species Range Estimation [63.65684199946094]
We propose a novel approach combining millions of citizen science species observations with textual descriptions from Wikipedia. Our framework maps locations, species, and text descriptions into a common space, enabling zero-shot range estimation from textual descriptions. Our approach also acts as a strong prior when combined with observational data, resulting in more accurate range estimation with less data.
arXiv Detail & Related papers (2024-10-14T17:22:55Z)
- Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z)
- SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data [68.2366021016172]
We present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird.
We also provide a dataset in Kenya representing low-data regimes.
We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks.
arXiv Detail & Related papers (2023-11-02T02:00:27Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.