AI Decodes Historical Chinese Archives to Reveal Lost Climate History
- URL: http://arxiv.org/abs/2601.22458v1
- Date: Fri, 30 Jan 2026 02:06:13 GMT
- Title: AI Decodes Historical Chinese Archives to Reveal Lost Climate History
- Authors: Sida He, Lingxi Xie, Xiaopeng Zhang, Qi Tian
- Abstract summary: We introduce a generative AI framework that inverts the logic of historical chroniclers by inferring the quantitative climate patterns associated with documented events. Applied to historical Chinese archives, it produces the sub-annual precipitation reconstruction for southeastern China over the period 1368-1911 AD. Our reconstruction not only quantifies iconic extremes like the Ming Dynasty's Great Drought but also, crucially, maps the full spatial and seasonal structure of El Niño influence on precipitation in this region over five centuries.
- Score: 82.46757587387704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Historical archives contain qualitative descriptions of climate events, yet converting these into quantitative records has remained a fundamental challenge. Here we introduce a paradigm shift: a generative AI framework that inverts the logic of historical chroniclers by inferring the quantitative climate patterns associated with documented events. Applied to historical Chinese archives, it produces the sub-annual precipitation reconstruction for southeastern China over the period 1368-1911 AD. Our reconstruction not only quantifies iconic extremes like the Ming Dynasty's Great Drought but also, crucially, maps the full spatial and seasonal structure of El Niño influence on precipitation in this region over five centuries, revealing dynamics inaccessible in shorter modern records. Our methodology and high-resolution climate dataset are directly applicable to climate science and have broader implications for the historical and social sciences.
Related papers
- Climate Knowledge in Large Language Models [0.0]
This study investigates the capacity of large language models to recall climate normals without external retrieval. We construct a global grid of queries at 1° resolution land points, providing coordinates and location descriptors, and validate responses against ERA5 reanalysis. Results show that LLMs encode non-trivial climate structure, capturing latitudinal and topographic patterns, with root-mean-square errors of 3-6 °C and biases of ±1 °C. We find that including geographic context reduces errors by 27% on average, with larger models being most sensitive to location descriptors.
arXiv Detail & Related papers (2025-10-09T10:25:36Z) - WeatherArchive-Bench: Benchmarking Retrieval-Augmented Reasoning for Historical Weather Archives [15.620758706846388]
We introduce WeatherArchive-Bench, the first benchmark for evaluating retrieval-augmented generation (RAG) systems on historical weather archives. WeatherArchive-Bench comprises two tasks: WeatherArchive-Retrieval, which measures a system's ability to locate historically relevant passages from over one million archival news segments, and WeatherArchive-Assessment, which evaluates whether Large Language Models can classify societal vulnerability and resilience indicators from extreme weather narratives.
arXiv Detail & Related papers (2025-10-06T19:58:42Z) - ClimateBench-M: A Multi-Modal Climate Data Benchmark with a Simple Generative Method [61.76389719956301]
We contribute a multi-modal climate benchmark, i.e., ClimateBench-M, which aligns time series climate data from ERA5, extreme weather events data from NOAA, and satellite image data from NASA. Under each data modality, we also propose a simple but strong generative method that could produce competitive performance in weather forecasting, thunderstorm alerts, and crop segmentation tasks.
arXiv Detail & Related papers (2025-04-10T02:22:23Z) - MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling [68.69647625472464]
Downscaling, a crucial task in meteorological forecasting, enables the reconstruction of high-resolution meteorological states for target regions.
Previous downscaling methods lacked tailored designs for meteorology and encountered structural limitations.
We propose a novel model called MambaDS, which enhances the utilization of multivariable correlations and topography information.
arXiv Detail & Related papers (2024-08-20T13:45:49Z) - Reconstructing Historical Climate Fields With Deep Learning [0.0]
We employ a recently introduced deep-learning approach based on Fourier convolutions, trained on numerical climate model output, to reconstruct historical climate fields. We are able to realistically reconstruct large and irregular areas of missing data, as well as reconstruct known historical events such as strong El Niño and La Niña with very little given information.
arXiv Detail & Related papers (2023-11-30T08:34:12Z) - Coloring the Past: Neural Historical Buildings Reconstruction from Archival Photography [69.93897305312574]
We introduce an approach to reconstruct the geometry of historical buildings, employing volumetric rendering techniques.
We leverage dense point clouds as a geometric prior and introduce a color appearance embedding loss to recover the color of the building given limited available color images.
arXiv Detail & Related papers (2023-11-29T16:59:45Z) - Interpretable AI-Driven Discovery of Terrain-Precipitation Relationships for Enhanced Climate Insights [8.780306158191443]
We propose an AI-driven knowledge discovery framework known as genetic algorithm-geographic weighted regression (GA-GWR).
Our approach seeks to unveil the explicit equations that govern the relationship between precipitation patterns and terrain characteristics in regions marked by complex terrain.
Through this AI-driven knowledge discovery, we uncover previously undisclosed explicit equations that shed light on the connection between terrain features and precipitation patterns.
arXiv Detail & Related papers (2023-09-27T04:47:22Z) - Multi-scale Digital Twin: Developing a fast and physics-informed surrogate model for groundwater contamination with uncertain climate models [53.44486283038738]
Climate change exacerbates the long-term soil management problem of groundwater contamination.
We develop a physics-informed machine learning surrogate model using a U-Net enhanced Fourier Neural Operator (U-FNO).
In parallel, we develop a convolutional autoencoder combined with climate data to reduce the dimensionality of climatic region similarities across the United States.
arXiv Detail & Related papers (2022-11-20T06:46:35Z) - Spatiotemporal modeling of European paleoclimate using doubly sparse Gaussian processes [61.31361524229248]
We build on recent scalable spatiotemporal GPs to reduce the computational burden.
We successfully employ such a doubly sparse GP to construct a probabilistic model of paleoclimate.
arXiv Detail & Related papers (2022-11-15T14:15:04Z)