HVR-Met: A Hypothesis-Verification-Replaning Agentic System for Extreme Weather Diagnosis
- URL: http://arxiv.org/abs/2603.01121v1
- Date: Sun, 01 Mar 2026 14:13:11 GMT
- Title: HVR-Met: A Hypothesis-Verification-Replaning Agentic System for Extreme Weather Diagnosis
- Authors: Shuo Tang, Jiadong Zhang, Jian Xu, Gengxian Zhou, Qizhao Jin, Qinxuan Wang, Yi Hu, Ning Hu, Hongchang Ren, Lingli He, Jiaolan Fu, Jingtao Ding, Shiming Xiang, Chenglin Liu
- Abstract summary: HVR-Met is a meteorological diagnostic system characterized by the deep integration of expert knowledge. Its central innovation is the "Hypothesis-Verification-Replanning" closed-loop mechanism. We introduce a novel benchmark focused on atomic-level subtasks.
- Score: 45.18017161777437
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While deep learning-based weather forecasting paradigms have made significant strides, addressing extreme weather diagnostics remains a formidable challenge. This gap exists primarily because the diagnostic process demands sophisticated multi-step logical reasoning, dynamic tool invocation, and expert-level prior judgment. Although agents possess inherent advantages in task decomposition and autonomous execution, current architectures are still hampered by critical bottlenecks: inadequate expert knowledge integration, a lack of professional-grade iterative reasoning loops, and the absence of fine-grained validation and evaluation systems for complex workflows under extreme conditions. To this end, we propose HVR-Met, a multi-agent meteorological diagnostic system characterized by the deep integration of expert knowledge. Its central innovation is the "Hypothesis-Verification-Replanning" closed-loop mechanism, which facilitates sophisticated iterative reasoning for anomalous meteorological signals during extreme weather events. To bridge gaps within existing evaluation frameworks, we further introduce a novel benchmark focused on atomic-level subtasks. Experimental evidence demonstrates that the system excels in complex diagnostic scenarios.
Related papers
- AnomaMind: Agentic Time Series Anomaly Detection with Tool-Augmented Reasoning [24.317775311623922]
AnomaMind is a time series anomaly detection framework that reformulates anomaly detection as a sequential decision-making process. AnomaMind operates through a structured workflow that localizes anomalous intervals in a coarse-to-fine manner. A key design of AnomaMind is an explicitly designed hybrid inference mechanism for tool-augmented anomaly detection.
arXiv Detail & Related papers (2026-02-14T14:35:34Z)
- Agentic Spatio-Temporal Grounding via Collaborative Reasoning [80.83158605034465]
Spatio-Temporal Video Grounding (STVG) aims to retrieve the spatio-temporal tube of a target object or person in a video given a text query. We propose the Agentic Spatio-Temporal Grounder (ASTG) framework for the task of STVG towards an open-world and training-free scenario. Specifically, two specialized agents, SRA (Spatial Reasoning Agent) and TRA (Temporal Reasoning Agent), are constructed by leveraging modern Multimodal Large Language Models (MLLMs). Experiments on popular benchmarks demonstrate the superiority of the proposed approach, which outperforms existing weakly-supervised and zero-shot approaches by a margin.
arXiv Detail & Related papers (2026-02-10T10:16:27Z)
- AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems [71.89040853616602]
We introduce AstroReason-Bench, a benchmark for evaluating agentic planning in Space Planning Problems (SPP). AstroReason-Bench integrates multiple scheduling regimes, including ground station communication and agile Earth observation, and provides a unified agent-oriented interaction protocol. We find that current agents substantially underperform specialized solvers, highlighting key limitations of generalist planning under realistic constraints.
arXiv Detail & Related papers (2026-01-16T15:02:41Z)
- EWE: An Agentic Framework for Extreme Weather Analysis [61.092871317626496]
Extreme Weather Expert (EWE) is the first intelligent agent framework dedicated to extreme weather analysis. EWE emulates expert visualizations through knowledge-guided planning, closed-loop reasoning, and a domain-tailored meteorological toolkit. To catalyze progress, we introduce the first benchmark for this emerging field, comprising a curated dataset of 103 high-impact events.
arXiv Detail & Related papers (2025-11-26T14:37:25Z)
- I-GLIDE: Input Groups for Latent Health Indicators in Degradation Estimation [1.034052616244602]
This paper introduces a novel framework for health indicator (HI) construction, advancing three key contributions. We adapt Reconstruction along Projected Pathways (RaPP) as a health indicator for remaining useful life (RUL) prediction for the first time, showing that it outperforms traditional reconstruction error metrics. We also propose indicator groups, a paradigm that isolates sensor subsets to model system-specific degradations, giving rise to our novel method, I-GLIDE.
arXiv Detail & Related papers (2025-11-26T09:39:35Z)
- Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics [89.1999907891494]
We present WebDetective, a benchmark of hint-free multi-hop questions paired with a controlled Wikipedia sandbox. Our evaluation of 25 state-of-the-art models reveals systematic weaknesses across all architectures. We develop an agentic workflow, EvidenceLoop, that explicitly targets the challenges our benchmark identifies.
arXiv Detail & Related papers (2025-10-01T07:59:03Z)
- Enhancing Retrieval Augmentation via Adversarial Collaboration [50.117273835877334]
We propose the Adversarial Collaboration RAG (AC-RAG) framework to address "Retrieval Hallucinations". AC-RAG employs two heterogeneous agents: a generalist Detector that identifies knowledge gaps, and a domain-specialized Resolver that provides precise solutions. Experiments show that AC-RAG significantly improves retrieval accuracy and outperforms state-of-the-art RAG methods across various vertical domains.
arXiv Detail & Related papers (2025-09-18T08:54:20Z)
- Process mining-driven modeling and simulation to enhance fault diagnosis in cyber-physical systems [5.065341495341096]
Fault diagnosis in Cyber-Physical Systems (CPSs) is essential for ensuring system dependability and operational efficiency. We present a novel unsupervised fault diagnosis methodology that integrates collective anomaly detection in time series, process mining, and simulation. This enables the creation of comprehensive fault dictionaries that support predictive maintenance and the development of digital twins for industrial environments.
arXiv Detail & Related papers (2025-06-26T17:29:37Z)
- Real-time and Downtime-tolerant Fault Diagnosis for Railway Turnout Machines (RTMs) Empowered with Cloud-Edge Pipeline Parallelism [9.544236630555627]
An edge-cloud collaborative early-warning system is proposed to enable real-time and downtime-tolerant fault diagnosis of RTMs.
Our ensemble-based fault diagnosis model achieves a remarkable 97.4% accuracy on a real-world dataset collected by Nanjing Metro in Jiangsu Province, China.
arXiv Detail & Related papers (2024-11-04T13:49:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.