BCWildfire: A Long-term Multi-factor Dataset and Deep Learning Benchmark for Boreal Wildfire Risk Prediction
- URL: http://arxiv.org/abs/2511.17597v1
- Date: Mon, 17 Nov 2025 22:13:00 GMT
- Title: BCWildfire: A Long-term Multi-factor Dataset and Deep Learning Benchmark for Boreal Wildfire Risk Prediction
- Authors: Zhengsen Xu, Sibo Cheng, Hongjie He, Lanying Wang, Wentao Sun, Jonathan Li, Lincoln Linlin Xu
- Abstract summary: We present a 25-year, daily-resolution wildfire dataset covering 240 million hectares across British Columbia and surrounding regions. We evaluate a diverse set of time-series forecasting models, including CNN-based, linear-based, Transformer-based, and Mamba-based architectures.
- Score: 12.480140332312695
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Wildfire risk prediction remains a critical yet challenging task due to the complex interactions among fuel conditions, meteorology, topography, and human activity. Despite growing interest in data-driven approaches, publicly available benchmark datasets that support long-term temporal modeling, large-scale spatial coverage, and multimodal drivers remain scarce. To address this gap, we present a 25-year, daily-resolution wildfire dataset covering 240 million hectares across British Columbia and surrounding regions. The dataset includes 38 covariates, encompassing active fire detections, weather variables, fuel conditions, terrain features, and anthropogenic factors. Using this benchmark, we evaluate a diverse set of time-series forecasting models, including CNN-based, linear-based, Transformer-based, and Mamba-based architectures. We also investigate the effectiveness of position embedding and the relative importance of different fire-driving factors. The dataset and the corresponding code can be found at https://github.com/SynUW/mmFire
Related papers
- Australian Bushfire Intelligence with AI-Driven Environmental Analytics [2.3974112195086383]
This study examines the capability of predictive environmental data for identifying high-risk bushfire zones across Australia. We integrated historical fire events from NASA-NASA, daily meteorological observations from Meteostat, and vegetation observations from Google Earth Engine. Under a binary framework distinguishing 'low' and 'high' fire risk, the ensemble approach achieved an accuracy of 87%.
arXiv Detail & Related papers (2026-01-03T05:43:12Z) - FireSentry: A Multi-Modal Spatio-temporal Benchmark Dataset for Fine-Grained Wildfire Spread Forecasting [41.82363110982653]
We present FireSentry, a provincial-scale multi-modal wildfire dataset characterized by sub-meter spatial and sub-second temporal resolution. FireSentry provides visible and infrared video streams, in-situ environmental measurements, and manually validated fire masks. Building on FireSentry, we establish a comprehensive benchmark encompassing physics-based, data-driven, and generative models.
arXiv Detail & Related papers (2025-12-03T02:02:47Z) - Zephyrus: An Agentic Framework for Weather Science [47.611521052984365]
Foundation models for weather science are pre-trained on vast amounts of structured numerical data and outperform traditional weather forecasting systems. Large language models (LLMs) excel at understanding and generating text but cannot reason about high-dimensional meteorological datasets. We bridge this gap by building a novel agentic framework for weather science. We design Zephyrus, a multi-turn LLM-based weather agent that iteratively analyzes weather datasets, observes results, and refines its approach through conversational feedback loops.
arXiv Detail & Related papers (2025-10-05T03:34:08Z) - Improving Transferability for Cross-domain Trajectory Prediction via Neural Stochastic Differential Equation [41.09061877498741]
Discrepancies exist among datasets due to external factors and data acquisition strategies.
The strong performance of models trained on large-scale datasets transfers poorly to other, smaller datasets.
We propose a method based on Neural Stochastic Differential Equations (NSDE) for alleviating these discrepancies.
The effectiveness of our method is validated against state-of-the-art trajectory prediction models on the popular benchmark datasets: nuScenes, Argoverse, Lyft, INTERACTION, and Open Motion dataset.
arXiv Detail & Related papers (2023-12-26T06:50:29Z) - FLOGA: A machine learning ready dataset, a benchmark and a novel deep learning model for burnt area mapping with Sentinel-2 [41.28284355136163]
Wildfires pose significant threats to human and animal lives, ecosystems, and socio-economic stability.
In this work, we create and introduce a machine-learning ready dataset we name FLOGA (Forest wiLdfire Observations for the Greek Area).
This dataset is unique as it comprises satellite imagery acquired before and after a wildfire event.
We use FLOGA to provide a thorough comparison of multiple Machine Learning and Deep Learning algorithms for the automatic extraction of burnt areas.
arXiv Detail & Related papers (2023-11-06T18:42:05Z) - LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z) - Mesogeos: A multi-purpose dataset for data-driven wildfire modeling in the Mediterranean [5.100085108873068]
Mesogeos is a large-scale dataset for wildfire modeling in the Mediterranean.
It integrates variables representing wildfire drivers (meteorology, vegetation, human activity) and historical records of wildfire ignitions and burned areas.
The datacube structure offers opportunities to assess machine learning (ML) usage in various wildfire modeling tasks.
arXiv Detail & Related papers (2023-06-08T12:11:16Z) - Multi-time Predictions of Wildfire Grid Map using Remote Sensing Local Data [0.0]
This paper proposes a distributed learning framework that shares local data, collected at ten locations in the western USA, among local agents.
The proposed model has distinct features that address the characteristic needs of prediction evaluation, including dynamic online estimation and time-series modeling.
arXiv Detail & Related papers (2022-09-15T22:34:06Z) - Next Day Wildfire Spread: A Machine Learning Data Set to Predict Wildfire Spreading from Remote-Sensing Data [5.814925201882753]
'Next Day Wildfire Spread' is a curated data set of historical wildfires aggregating nearly a decade of remote-sensing data across the United States.
We implement a convolutional autoencoder that takes advantage of the spatial information of this data to predict wildfire spread.
This data set can be used as a benchmark for developing wildfire propagation models based on remote sensing data for a lead time of one day.
arXiv Detail & Related papers (2021-12-04T23:28:44Z) - From Static to Dynamic Prediction: Wildfire Risk Assessment Based on Multiple Environmental Factors [69.9674326582747]
Wildfires are among the biggest disasters that frequently occur on the west coast of the United States.
We propose static and dynamic prediction models to analyze and assess the areas with high wildfire risks in California.
arXiv Detail & Related papers (2021-03-14T17:56:17Z) - Uncertainty Aware Wildfire Management [6.997483623023005]
Recent wildfires in the United States have resulted in loss of life and billions of dollars.
There are limited resources to be deployed over a massive area and the spread of the fire is challenging to predict.
This paper proposes a decision-theoretic approach to combat wildfires.
arXiv Detail & Related papers (2020-10-15T17:47:31Z) - Speak2Label: Using Domain Knowledge for Creating a Large Scale Driver Gaze Zone Estimation Dataset [55.391532084304494]
The Driver Gaze in the Wild dataset contains 586 recordings, captured during different times of the day including evenings.
The Driver Gaze in the Wild dataset contains 338 subjects with an age range of 18-63 years.
arXiv Detail & Related papers (2020-04-13T14:47:34Z) - Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework that uncovers relationships among different objects in the scene in order to reason about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not crossing the street, is a crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)