Emission-GPT: A domain-specific language model agent for knowledge retrieval, emission inventory and data analysis
- URL: http://arxiv.org/abs/2510.02359v1
- Date: Sun, 28 Sep 2025 07:50:05 GMT
- Title: Emission-GPT: A domain-specific language model agent for knowledge retrieval, emission inventory and data analysis
- Authors: Jiashu Ye, Tong Wu, Weiwen Chen, Hao Zhang, Zeteng Lin, Xingxing Li, Shujuan Weng, Manni Zhu, Xin Yuan, Xinlong Hong, Jingjie Li, Junyu Zheng, Zhijiong Huang, Jing Tang
- Abstract summary: Emission-GPT is a knowledge-enhanced large language model agent tailored for the atmospheric emissions domain. Built on a curated knowledge base of over 10,000 documents (including standards, reports, guidebooks, and peer-reviewed literature), Emission-GPT integrates prompt engineering and question completion to support accurate domain-specific question answering. Emission-GPT enables users to interactively analyze emissions data via natural language, such as querying and visualizing inventories, analyzing source contributions, and recommending emission factors for user-defined scenarios.
- Score: 17.30731286907039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Improving air quality and addressing climate change relies on accurate understanding and analysis of air pollutant and greenhouse gas emissions. However, emission-related knowledge is often fragmented and highly specialized, while existing methods for accessing and compiling emissions data remain inefficient. These issues hinder the ability of non-experts to interpret emissions information, posing challenges to research and management. To address this, we present Emission-GPT, a knowledge-enhanced large language model agent tailored for the atmospheric emissions domain. Built on a curated knowledge base of over 10,000 documents (including standards, reports, guidebooks, and peer-reviewed literature), Emission-GPT integrates prompt engineering and question completion to support accurate domain-specific question answering. Emission-GPT also enables users to interactively analyze emissions data via natural language, such as querying and visualizing inventories, analyzing source contributions, and recommending emission factors for user-defined scenarios. A case study in Guangdong Province demonstrates that Emission-GPT can extract key insights--such as point source distributions and sectoral trends--directly from raw data with simple prompts. Its modular and extensible architecture facilitates automation of traditionally manual workflows, positioning Emission-GPT as a foundational tool for next-generation emission inventory development and scenario-based assessment.
Related papers
- The Climate Change Knowledge Graph: Supporting Climate Services [33.331299436929946]
The Climate Change Knowledge Graph is designed to integrate diverse data sources related to climate simulations into a coherent knowledge graph. This innovative resource allows for executing complex queries involving climate models, simulations, variables, configurations, temporal domains, and granularities.
arXiv Detail & Related papers (2026-02-23T12:42:05Z) - Use What You Know: Causal Foundation Models with Partial Graphs [97.91863420927866]
Recently proposed Causal Foundation Models (CFMs) promise a more unified approach by amortising causal discovery and inference in a single step. We bridge this gap by introducing methods to condition CFMs on causal information, such as the causal graph or more readily available ancestral information.
arXiv Detail & Related papers (2026-02-16T17:56:37Z) - EWE: An Agentic Framework for Extreme Weather Analysis [61.092871317626496]
Extreme Weather Expert (EWE) is the first intelligent agent framework dedicated to this task. EWE emulates expert visualizations through knowledge-guided planning, closed-loop reasoning, and a domain-tailored meteorological toolkit. To catalyze progress, we introduce the first benchmark for this emerging field, comprising a curated dataset of 103 high-impact events.
arXiv Detail & Related papers (2025-11-26T14:37:25Z) - Closing Gaps in Emissions Monitoring with Climate TRACE [1.0107331293412294]
Climate TRACE is an open-access platform delivering global emissions estimates with enhanced detail, coverage, and timeliness. The dataset is the first to provide globally comprehensive emissions estimates for individual sources.
arXiv Detail & Related papers (2025-11-24T16:28:44Z) - GeoGPT-RAG Technical Report [48.23789135946953]
GeoGPT is an open large language model system built to advance research in the geosciences. RAG augments model outputs with relevant information retrieved from an external knowledge source. GeoGPT uses RAG to draw from the GeoGPT Library, a specialized corpus curated for geoscientific content.
arXiv Detail & Related papers (2025-08-18T08:29:22Z) - Leveraging Land Cover Priors for Isoprene Emission Super-Resolution [15.868193361155656]
This study contributes to atmospheric chemistry and climate modeling by providing a cost-effective, data-driven approach to refining BVOC emission maps. The proposed method enhances the usability of satellite-based emissions data, supporting applications in air quality forecasting, climate impact assessments, and environmental studies.
arXiv Detail & Related papers (2025-03-24T13:23:46Z) - Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change [0.0]
Climate misinformation is a problem that has the potential to be substantially aggravated by the development of Large Language Models (LLMs). In this study we evaluate the potential for LLMs to be part of the solution for mitigating online dis/misinformation rather than the problem.
arXiv Detail & Related papers (2025-01-23T16:21:15Z) - CarbonChat: Large Language Model-Based Corporate Carbon Emission Analysis and Climate Knowledge Q&A System [4.008184902967172]
This paper proposes CarbonChat, a large language model-based corporate carbon emission analysis and climate knowledge Q&A system. A diversified index module construction method is proposed to handle the segmentation of rule-based and long-text documents. Fourteen dimensions are established for carbon emission analysis, enabling report summarization, relevance evaluation, and customized responses.
arXiv Detail & Related papers (2025-01-03T08:45:38Z) - Evaluating Automatic Speech Recognition Systems for Korean Meteorological Experts [48.89527378273811]
This paper explores integrating Automatic Speech Recognition into natural language query systems for Korean meteorologists. We address challenges in developing ASR systems for the Korean weather domain, specifically specialized vocabulary and Korean linguistic intricacies.
arXiv Detail & Related papers (2024-10-24T05:40:07Z) - Machine Learning for Methane Detection and Quantification from Space -- A survey [49.7996292123687]
Methane (CH₄) is a potent anthropogenic greenhouse gas, contributing 86 times more to global warming than carbon dioxide (CO₂) over 20 years.
This work expands existing information on operational methane point source detection sensors in the Short-Wave Infrared (SWIR) bands.
It reviews the state-of-the-art for traditional as well as Machine Learning (ML) approaches.
arXiv Detail & Related papers (2024-08-27T15:03:20Z) - Emissions Reporting Maturity Model: supporting cities to leverage emissions-related processes through performance indicators and artificial intelligence [0.0]
This work proposes an Emissions Reporting Maturity Model (ERMM) for examining, clustering, and analysing data from emissions reporting initiatives.
The PIDP supports the preparation of the data from emissions-related databases, the classification of the data according to similarities highlighted by different clustering techniques, and the identification of performance indicator candidates.
arXiv Detail & Related papers (2023-12-08T17:51:57Z) - Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning [77.62876532784759]
Machine learning (ML) requires using energy to carry out computations during the model training process.
The generation of this energy comes with an environmental cost in terms of greenhouse gas emissions, depending on quantity used and the energy source.
We present a survey of the carbon emissions of 95 ML models across time and different tasks in natural language processing and computer vision.
arXiv Detail & Related papers (2023-02-16T18:35:00Z) - Towards Understanding Omission in Dialogue Summarization [45.932368303107104]
Previous work indicated that omission is a major factor affecting the quality of summarization.
We propose the OLDS dataset, which provides high-quality Omission Labels for Dialogue Summarization.
arXiv Detail & Related papers (2022-11-14T06:56:59Z) - Incorporating Causal Graphical Prior Knowledge into Predictive Modeling via Simple Data Augmentation [92.96204497841032]
Causal graphs (CGs) are compact representations of the knowledge of the data generating processes behind the data distributions.
We propose a model-agnostic data augmentation method that allows us to exploit the prior knowledge of the conditional independence (CI) relations.
We experimentally show that the proposed method is effective in improving the prediction accuracy, especially in the small-data regime.
arXiv Detail & Related papers (2021-02-27T06:13:59Z) - Analyzing Sustainability Reports Using Natural Language Processing [68.8204255655161]
In recent years, companies have increasingly been aiming to both mitigate their environmental impact and adapt to the changing climate context.
This is reported via increasingly exhaustive reports, which cover many types of climate risks and exposures under the umbrella of Environmental, Social, and Governance (ESG).
We present this tool and the methodology that we used to develop it in the present article.
arXiv Detail & Related papers (2020-11-03T21:22:42Z)