OKG-LLM: Aligning Ocean Knowledge Graph with Observation Data via LLMs for Global Sea Surface Temperature Prediction
- URL: http://arxiv.org/abs/2508.00933v1
- Date: Thu, 31 Jul 2025 02:06:03 GMT
- Title: OKG-LLM: Aligning Ocean Knowledge Graph with Observation Data via LLMs for Global Sea Surface Temperature Prediction
- Authors: Hanchen Yang, Jiaqi Wang, Jiannong Cao, Wengen Li, Jialun Zheng, Yangning Li, Chunyu Miao, Jihong Guan, Shuigeng Zhou, Philip S. Yu
- Abstract summary: This work presents the first systematic effort to construct an Ocean Knowledge Graph (OKG) specifically designed to represent diverse ocean knowledge for SST prediction. We develop a graph embedding network to learn the comprehensive semantic and structural knowledge within the OKG, capturing both the unique characteristics of individual sea regions and the complex correlations between them. Finally, we align the learned knowledge with fine-grained numerical SST data and leverage a pre-trained LLM to model SST patterns for accurate prediction.
- Score: 70.48962924608033
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sea surface temperature (SST) prediction is a critical task in ocean science, supporting various applications, such as weather forecasting, fisheries management, and storm tracking. While existing data-driven methods have demonstrated significant success, they often neglect to leverage the rich domain knowledge accumulated over the past decades, limiting further advancements in prediction accuracy. The recent emergence of large language models (LLMs) has highlighted the potential of integrating domain knowledge for downstream tasks. However, the application of LLMs to SST prediction remains underexplored, primarily due to the challenge of integrating ocean domain knowledge and numerical data. To address this issue, we propose Ocean Knowledge Graph-enhanced LLM (OKG-LLM), a novel framework for global SST prediction. To the best of our knowledge, this work presents the first systematic effort to construct an Ocean Knowledge Graph (OKG) specifically designed to represent diverse ocean knowledge for SST prediction. We then develop a graph embedding network to learn the comprehensive semantic and structural knowledge within the OKG, capturing both the unique characteristics of individual sea regions and the complex correlations between them. Finally, we align and fuse the learned knowledge with fine-grained numerical SST data and leverage a pre-trained LLM to model SST patterns for accurate prediction. Extensive experiments on the real-world dataset demonstrate that OKG-LLM consistently outperforms state-of-the-art methods, showcasing its effectiveness, robustness, and potential to advance SST prediction. The codes are available in the online repository.
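The abstract describes three stages: embedding a knowledge graph of sea regions, aligning that knowledge with numerical SST data, and passing the fused representation to a pre-trained LLM. The following is a minimal sketch of the first two stages only, not the authors' implementation. It assumes a single GCN-style message-passing step for the graph embedding and plain dot-product cross-attention for the alignment. The toy graph, all dimensions, and the function names (`graph_embed`, `align_and_fuse`) are hypothetical choices made for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_embed(adj, node_feats, weight):
    """One message-passing step: aggregate neighbour features, then project."""
    return np.tanh(normalize_adjacency(adj) @ node_feats @ weight)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def align_and_fuse(sst_tokens, knowledge):
    """Cross-attention: SST tokens act as queries over region embeddings.

    The attended knowledge is added back to the tokens (residual fusion).
    """
    scale = np.sqrt(sst_tokens.shape[-1])
    attn = softmax(sst_tokens @ knowledge.T / scale, axis=-1)
    return sst_tokens + attn @ knowledge, attn


# Hypothetical toy setup: 4 sea regions connected in a chain.
rng = np.random.default_rng(0)
num_regions, feat_dim, d_model, num_tokens = 4, 8, 16, 6
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
node_feats = rng.normal(size=(num_regions, feat_dim))   # stand-in region attributes
weight = rng.normal(size=(feat_dim, d_model)) * 0.1     # stand-in learned projection
sst_tokens = rng.normal(size=(num_tokens, d_model))     # stand-in embedded SST patches

knowledge = graph_embed(adj, node_feats, weight)
fused, attn = align_and_fuse(sst_tokens, knowledge)
print(fused.shape, attn.shape)
```

In a full pipeline the fused tokens would be fed to the pre-trained LLM backbone; that step is omitted here because the abstract does not specify the model or its interface.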
Related papers
- Deep Learning Weather Models for Subregional Ocean Forecasting: A Case Study on the Canary Current Upwelling System [0.0]
This work aims to adapt a graph neural network initially developed for global weather forecasting to improve subregional ocean prediction. The model is trained with satellite data and compared to state-of-the-art physical ocean models to assess its performance in capturing ocean dynamics. Our results show that the deep learning model surpasses traditional methods in precision despite some challenges in upwelling areas.
arXiv Detail & Related papers (2025-05-30T10:10:40Z)
- Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation [67.23953699167274]
Self-supervised learning (SSL) has enabled the development of vision foundation models for Earth Observation (EO). In EO, this challenge is amplified by the redundancy and heavy-tailed distributions common in satellite imagery. We propose a dynamic dataset pruning strategy designed to improve SSL pre-training by maximizing dataset diversity and balance.
arXiv Detail & Related papers (2025-04-09T15:13:26Z)
- Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction [84.26340606752763]
In this paper, we introduce the conservation-informed GNN (CiGNN), an end-to-end explainable learning framework. The network is designed to conform to the general symmetry conservation law via symmetry where conservative and non-conservative information passes over a multiscale space by a latent temporal marching strategy. Results demonstrate that CiGNN exhibits remarkable baseline accuracy and generalizability, and is readily applicable to learning for prediction of various spatiotemporal dynamics.
arXiv Detail & Related papers (2024-12-30T13:55:59Z)
- Deep Learning for Sea Surface Temperature Reconstruction under Cloud Occlusion [34.00878406145686]
We describe several Machine Learning models to fill the cloud-occluded areas starting from MODIS Aqua nighttime L3 images. To tackle this challenge, we employed a type of Convolutional Neural Network model (U-net) to reconstruct cloud-covered portions of satellite imagery. Our best-performing architecture shows 50% lower root mean square errors than established gap-filling methods.
arXiv Detail & Related papers (2024-12-04T15:49:49Z)
- Towards an end-to-end artificial intelligence driven global weather forecasting system [57.5191940978886]
We present an AI-based data assimilation model, i.e., Adas, for global weather variables.
We demonstrate that Adas can assimilate global observations to produce high-quality analysis, enabling the system to operate stably over the long term.
We are the first to apply the methods to real-world scenarios, which is more challenging and has considerable practical application potential.
arXiv Detail & Related papers (2023-12-18T09:05:28Z)
- GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
- Multi-decadal Sea Level Prediction using Neural Networks and Spectral Clustering on Climate Model Large Ensembles and Satellite Altimeter Data [0.0]
We show the potential of machine learning (ML) in this challenging application of long-term sea level forecasting.
We develop a supervised learning framework using fully connected neural networks (FCNNs) that can predict the sea level trend.
We also show the effectiveness of partitioning our spatial dataset and learning a dedicated ML model for each segmented region.
arXiv Detail & Related papers (2023-10-06T19:06:43Z)
- Learning to Predict Navigational Patterns from Partial Observations [63.04492958425066]
This paper presents the first self-supervised learning (SSL) method for learning to infer navigational patterns in real-world environments from partial observations only.
We demonstrate how to infer global navigational patterns by fitting a maximum likelihood graph to the DSLP field.
Experiments show that our SSL model outperforms two SOTA supervised lane graph prediction models on the nuScenes dataset.
arXiv Detail & Related papers (2023-04-26T02:08:46Z)
- Physical Knowledge Enhanced Deep Neural Network for Sea Surface Temperature Prediction [29.989387641655625]
We introduce a method for Sea Surface Temperature (SST) prediction that transfers physical knowledge from historical observations to numerical models.
Specifically, we use a combination of an encoder and a generative adversarial network (GAN) to capture physical knowledge from the observed data.
The numerical model data is then fed into the pre-trained model to generate physics-enhanced data, which can then be used for SST prediction.
arXiv Detail & Related papers (2023-04-19T02:08:54Z)
- Towards Spatio-temporal Sea Surface Temperature Forecasting via Static and Dynamic Learnable Personalized Graph Convolution Network [9.189893653029076]
This paper proposes a novel static and dynamic learnable personalized graph convolution network (SD-LPGC).
Specifically, two graph learning layers are first constructed to respectively model the stable long-term and short-term evolutionary patterns hidden in the SST signals.
Then, a learnable personalized convolution layer is designed to fuse this information.
arXiv Detail & Related papers (2023-04-12T14:35:38Z)
- Predicting Critical Biogeochemistry of the Southern Ocean for Climate Monitoring [1.8689461238197955]
We train neural networks to predict silicate and phosphate values in the Southern Ocean from temperature, pressure, salinity, oxygen, nitrate, and location.
We apply these models to earth system model (ESM) and BGC-Argo data to expand the utility of this ocean observation network.
arXiv Detail & Related papers (2021-10-30T00:13:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.